ABSTRACT



Research reports on the nature of speech, instrumentation for the
investigation of speech, and practical applications of speech research
are included in this status report for the April 1-June 30, 1981, period.
The 14 reports deal with the following topics: (1) electromyography as a
technique for laryngeal investigation, (2) the phonatory mechanism, (3)
phonetic perception of sinusoidal signals, (4) memory for item order and
phonetic recoding in the beginning reader, (5) perceptual equivalence of
two kinds of ambiguous speech stimuli, (6) perceptual targets and
production rules, (7) orthographic variations and visual information
processing, (8) visual word recognition in Serbo-Croatian, (9) word
recognition with mixed-alphabet forms, (10) intralanguage versus
interlanguage Stroop effects in two types of writing systems, (11)
categorical perception of English "r" and "l" sounds by Japanese
bilinguals, (12) the influence of vocalic context on perception of the
"sh"-"s" distinction and two ways of avoiding it, (13) grammatical
priming of inflected nouns, and (14) an evaluation of the "Basic
Orthographic Syllabic Structure" in a phonologically shallow
orthography. (FL)

I. MANUSCRIPTS AND EXTENDED REPORTS



ELECTROMYOGRAPHY AS A TECHNIQUE FOR LARYNGEAL INVESTIGATION*

Katherine S. Harris+

While, as earlier papers at this conference have indicated, the forces
that determine laryngeal adjustment are complex, muscular forces are
extremely important. In recent years, techniques for studying muscle
activity in general have improved, and with these developments, the study
of the laryngeal muscles in normal alert humans has become possible using
the techniques of electromyography. In this paper, I will discuss some
properties of muscles, and of the laryngeal muscles in particular,
techniques for EMG recording, and, finally, some results of studies on the
muscular control of the larynx.

MUSCLE PROPERTIES

The building block for a consideration of muscle activity is the motor
unit. This term was coined by Liddell and Sherrington (1925) to include the
motoneuron and the muscle fibers it supplies. The contractile response to
one impulse in one motor neuron is a twitch contraction in the innervated
muscle fibers. Thus, the smallest unit of muscular activity is a
contraction of the muscle fibers of a single motor unit, and the smoothly
graded contraction of a muscle is accomplished by temporal and spatial
summation of the activity of a number of motor units.

The muscles of the body have somewhat different tasks, and their
properties are well correlated with these tasks. For example, some muscles,
such as the muscles of the fingers, must make finely tuned movements, while
others, such as those of the leg, must support the body against the forces
of gravity for long periods of time. These muscles differ in the size of
their motor units, and in the histochemical properties of the individual
muscle fibers, properties that determine their resistance to fatigue.

Table 1 presents some data on motor unit size in the intrinsic laryngeal
muscles, with data on one of the eye muscles and the biceps, for comparison.



*A version of this paper was presented at the Conference on Assessment of
Vocal Pathology, Bethesda, Md., April 1979 (Proceedings to be published in
ASHA Reports).

+Also Graduate Center, City University of New York.

Acknowledgment. This work was supported by NINCDS Grants NS 13870 and
NS 13617, and BRSG Grant RR05596.

Table 1



Data on the Innervation Ratio of the Intrinsic Laryngeal Muscles with
Some Comparison Information on One of the Eye Muscles and the Biceps

Sources: man (Faaborg-Andersen, 1957); man (English & Blevins, 1969);
cat (English & Blevins, 1969); man (Buchthal, 1973)

Larynx
    CT                          166, 30, 55
    TA, IA                      247, 90
    PCA                         116
    LCA                         64
Other
    Rectus Oculi Lateralis      13

CT     Cricothyroid
TA     Thyroarytenoid
IA     Interarytenoid
PCA    Posterior cricoarytenoid
LCA    Lateral cricoarytenoid

While different authors have found differences in the number of fibers in a
motor unit, there is general agreement that the laryngeal muscles have low
innervation ratios, though not quite so low as those of the eyeball and
middle ear; the muscles of the limbs and trunk have generally far higher
ratios.

The muscle fibers themselves consist of a number of myofibrils, made up,
in turn, of a parallel overlapping array of actin and myosin filaments. In
contraction, the actin and myosin filaments slide relative to each other, so
that the muscle shortens and develops tension. In normal physiological
conditions, this shortening is initiated by the release of a chemical
transmitter, acetylcholine, at the nerve-muscle junction, the motor end
plate.

When a muscle fiber is at rest, there is a potential difference across
the cell membrane of about -90 mV, due to the difference in its
permeability to sodium and potassium ions. When a nerve impulse reaches the
motor end plate, acetylcholine is released, which changes the permeability
of the membrane to sodium and potassium ions. If this depolarization reaches
sufficient levels, the change in potential becomes self-regenerating, and
travels along the muscle fiber. During the passage of this action potential,
the membrane potential rises, then reverses its sign and finally returns to
its resting value of -90 mV. The movement of ions, and the associated
changes in potential, are, of course, the events generating the
electromyographic signal. The ionic currents at the membrane apparently
release calcium ions within the muscle; the diffused calcium activates the
contractile component of the muscle, producing the mechanical effect of
muscle shortening or tension development (Carlson & Wilkie, 1968).

While the fibers of striated muscles share many properties, they show
some adaptations to their individual tasks. The muscles of the larynx must
be designed for rapid adjustment; however, because of their participation in
respiration, they must have some capacity for sustained activity without
fatigue. Muscle fibers are of two basic types, red and white, although there
are variants in different systems in different animals. The "red" and "white"
designations refer to a difference in the fiber color, familiar from the
light and dark meat of chicken. The two types differ in their metabolic
properties, with red muscle more suited to sustained contraction, owing to
its fatigue resistance, and white more suited to rapid phasic contraction.
Most muscles of the body, including the muscles of the larynx, show mixed red
and white fibers. Any single motor unit, however, is composed of fibers of a
uniform type (Brandstater & Lambert, 1973), although, since adjacent motor
units have overlapping territories, a cross-section of a muscle will show a
checkerboard pattern of red and white.

Biochemical and histological studies of the laryngeal muscles up to 1970
were summarized by Sawashima (1970). He concluded that, with respect to
metabolic properties, the intrinsic laryngeal muscles as a group appeared to
be intermediate between skeletal and heart muscles. However, he found
disagreements among the authors he reviewed as to similarities and
dissimilarities within the group.



Since that review, there have been further studies of the histochemistry
of the intrinsic muscles of the larynx. Data from one of them (Edström,
Lindquist, & Martensson, 1974) are shown in Table 2, showing the percentages
of Type I and Type II (red and white) fibers found for each of the four
laryngeal muscles examined. While some of the fibers were like Type I and
Type II fibers found in limb muscles, others were variants of previously
identified types. It is interesting to note that Type II variants are far
more common in the thyroarytenoid than in the cricothyroid.

Table 2

Data on Histochemical Properties of the Intrinsic Laryngeal Muscles
in Cat, after Edström, Lindquist, and Martensson (1974)

Columns: TYPE I (1), (2); TYPE II (1), (2), (3)

Fiber type in skeletal muscle (Kugelberg, 1973): I, IIA, IIB, IIC

Overall % in laryngeal muscles, with most common subtype starred

Table 3

Data from Atkinson (1978) on the Mean Response Time for Some Intrinsic
and Extrinsic Laryngeal Muscles

A second study (Sahgal & Hast, 1974) examined the histochemical reactions
to ATP and three oxidative enzymes in cricothyroid and thyroarytenoid. The
results show some differences between the muscles, which the authors believe
are related to the differences in the speed of contraction of the muscles.

Thus, differences in the histochemistry of the muscles appear to be
reflected in their contractile properties. We have seen that the laryngeal
muscles are composed predominantly of Type II fibers, like the intraocular
muscles in man (Kugelberg, 1973). The laryngeal muscles are generally agreed
to be fast muscles, although different authors have obtained different values
for their contraction time, the time from nerve or muscle stimulation to the
peak of the muscle tension. Figure 1, adapted from Sawashima's review (1970),
summarizes the results. The thyroarytenoid is consistently found to be faster
than the cricothyroid, which is consonant with the difference in proportion
of Type II fibers in the two muscles and, according to Sahgal and Hast
(1974), with the difference in their histochemical properties.

Contraction time for the intrinsic laryngeal muscles has been estimated
by a very different technique by Atkinson (1978) at Haskins Laboratories. He
reasoned that, if a causal relationship between f0 and the EMG activity of
various laryngeal muscles were assumed, there should be a correlation between
f0 and gross EMG activity, at some time delay determined by the mechanical
properties of the muscle. Thus, cross-correlation analysis should provide
clues to relative contraction time.

He asked speakers to produce sentences varying in stress and intonation,
thus varying f0, and cross-correlated average f0 and rectified and averaged
EMG activity at varying delay times. Table 3 shows the delay times at which
correlation reached peak value for different muscles. The finding of shorter
mean response time for thyroarytenoid and lateral cricoarytenoid than for
cricothyroid, with longer response times for the strap muscles, is like the
results obtained by more conventional techniques, summarized in Figure 1, and
also parallels the histochemical grouping of TA with LCA, shown in Table 2.
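
The delay search behind a table of this kind can be sketched in a few lines.
What follows is only an illustration under stated assumptions, not Atkinson's
procedure: the data are synthetic, and the function names `pearson` and
`peak_lag`, the sample spacing, and the 4-sample delay are all hypothetical
choices made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def peak_lag(emg, f0, max_lag):
    """Return (lag, r) for the delay, in samples, at which the EMG
    envelope correlates best with the later f0 contour."""
    best_lag, best_r = 0, -2.0
    for lag in range(max_lag + 1):
        # Pair emg[t] with f0[t + lag]: EMG leads, f0 follows.
        r = pearson(emg[:len(emg) - lag], f0[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Toy data (assumption): f0 is the EMG envelope delayed by 4 samples,
# scaled, and offset; real contours would be far noisier.
emg = [math.sin(0.3 * t) ** 2 for t in range(200)]
delay = 4
f0 = [100.0] * delay + [100.0 + 20.0 * v for v in emg[:-delay]]

lag, r = peak_lag(emg, f0, max_lag=15)
print(lag, round(r, 3))
```

Because the toy f0 track is built as a delayed copy of the envelope, the
search recovers the delay that was put in. With measured signals the
correlation peak is broader, and the lag is meaningful only relative to the
smoothing applied to both signals.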



THE ELECTROMYOGRAPHIC SIGNAL

The origin of the electromyographic signal is discussed above in only
very general terms. If the signals from the laryngeal muscles are to be
considered in detail, the recording procedure itself must be discussed.
Figure 2 (Geddes, 1972) shows a muscle with a pair of recording electrodes on
its surface. The fibers are aligned parallel to each other. When a muscle
fiber or the nerve is stimulated, a wave of depolarization passes along each
stimulated fiber. However, since each recording electrode is most sensitive
to the fiber closest to it, the event recorded will be weighted by the
distance between the pickup and the active fiber, as shown in the figure. As
the wave of depolarization sweeps down the fibers and reaches the second
electrode, the recorded signal becomes negative. The event recorded also
reflects the timing of the action potential passage at the two electrodes and
the size of the recording surface. In the example shown, there is a period
when the fiber is depolarized under both electrodes; hence, the signal
returns to zero before reversing its sign. Another factor determining the
signal picked up by the electrodes is the intervening tissue. In general, the
presence of tissue creates a low-pass filtering effect whose bandwidth
decreases as distance increases (DeLuca, 1978).

Figure 1. Contraction time in msec for various laryngeal muscles. This
          figure is adapted in part from Table 1, Sawashima, 1970.

Figure 2. Schematic diagram of electromyographic recording. In part (a), two
          electrodes are shown positioned over six muscle fibers. In (b),
          the summed potential differences are shown for electrodes A and B,
          with the contributions from each fiber, and their difference.
          Reprinted from Geddes, 1972.
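
The return to zero before the sign reversal can be reproduced with a very
crude model. The sketch below is an assumption-laden illustration, not a
reconstruction of the Geddes figure: the depolarized zone is treated as a
rectangle, a single fiber is considered, tissue filtering is ignored, and the
function name `bipolar_trace` and its parameters are hypothetical.

```python
def bipolar_trace(n_samples, zone_len, spacing, amplitude=1.0):
    """Differential signal A - B as a depolarized zone passes two electrodes.

    The zone is modeled as a rectangle `zone_len` samples long that reaches
    electrode A at sample 0 and electrode B `spacing` samples later.
    """
    def seen_at(t, arrival):
        # Potential seen by one electrode while the zone lies under it.
        return amplitude if arrival <= t < arrival + zone_len else 0.0

    return [seen_at(t, 0) - seen_at(t, spacing) for t in range(n_samples)]

# Zone longer than the electrode spacing: both electrodes are covered for a
# while, so the difference sits at zero before changing sign.
trace = bipolar_trace(n_samples=12, zone_len=6, spacing=3)
print(trace)
```

The sign convention here (first deflection positive) is arbitrary. Making
`zone_len` shorter than `spacing` separates the two deflections by a silent
gap instead, which is one way to see why the recorded shape depends on
electrode geometry as well as on the fiber.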



While it is possible to record from a single muscle fiber (Ekstedt &
Stålberg, 1973), the more usual recording represents events in a motor unit,
or an aggregate of motor units. Under normal conditions, an action potential
propagating down a motor nerve activates all the fibers of its motor unit.
The fibers of a single motor unit are intermingled with those of other units
in such a way that the territory of one unit is about 20 times the
cross-sectional area of the fibers of the unit (Buchthal, Erminio, &
Rosenfalck, 1959). Since a portion of a muscle might contain fibers belonging
to any of fifty motor units, an electrode in the vicinity might detect
activity in any or all of them. The signal reaching a pair of electrodes in
active tissue is the weighted sum of the activity of each of the fibers of a
motor unit, with the filtering properties of the tissue between the electrode
and the active fiber taken into account. Since the orientation of the fibers
of each motor unit with respect to a fixed recording site will be unique, the
shape of the resulting recorded action potential will similarly be unique,
and can be used to recognize the unit (LeFever, 1980).

When a muscle is activated, the electrical manifestation of a motor unit
action potential is accompanied by a twitch of the activated fibers. In
muscle contraction in physiological conditions, the motor units are
repeatedly activated, whether the type of contraction is isometric (the
muscle does not shorten, but develops tension) or anisometric (the muscle
shortens).



THE ELECTRODE

In recordings from the laryngeal muscles, or any others, it is often
possible to recognize individual motor units by visual inspection, especially
when levels of contraction are low, so that only a few motor units are
active. An example is shown in Figure 3, a recording from the cricothyroid
muscle (Faaborg-Andersen, 1964). Alternatively, it is possible to record from
such a large number of active fibers that individual components cannot be
recognized, as in Figure 4. The signals shown here are a so-called
"interference pattern." That is, the pattern represents the activity of a
large number of fibers. The experimenter may wish to record single motor
units or interference patterns depending on the purpose of the experiment,
and makes a choice of electrode accordingly.

Three general types of electrodes have been used in speech research:
surface, needle, and hooked wire electrodes. Of these, hooked wire electrodes
have been most useful for recording from the laryngeal muscles. The muscles
of the larynx are aligned in such a way that signals picked up by an
electrode on the neck surface are ambiguous as to which muscle is the signal
source. Thus, although attempts have been made to use surface recordings from
locations over the thyroid cartilage in a biofeedback application (Guitar,
1975), it seems unlikely that much further application will be made of such
techniques. Needle electrode insertions into the laryngeal muscles are not
generally feasible for posterior cricoarytenoid and interarytenoid muscles,
although such insertions were used by Faaborg-Andersen in his classic study.
The work of the past decade was done almost entirely with hooked wire
electrodes, except for some clinical work to be described by Hirose.

Figure 3. Action potentials of a single motor unit during phonation. A:
          Cricothyroid muscle. B: Microphone recording. Reprinted from D.
          Brewer, 1964.

Figure 4. Quiet respiration. The onset of inspiration is indicated by the
          vertical stippled lines. A and B: Cricothyroid muscle. C and D:
          Vocalis muscle. E: Posterior cricoarytenoid muscle. Reprinted
          from D. Brewer, 1964.

Figure 5 shows the classic version of the hooked wire electrode (Basmajian
& Stecko, 1962). Some technical details and possible variants of this type
of electrode are discussed by Basmajian (1978). This type of electrode has
been used in recording from the laryngeal muscles by a number of
investigators besides ourselves (Hirano & Ohala, 1969; Shipp, Fishman, &
Morrissey, 1970). Using them, we have been able to record from all of the
intrinsic laryngeal muscles (and a wide variety of other speech muscles)
using techniques developed collaboratively with Dr. Hajime Hirose and his
colleagues at the Institute of Logopedics and Phoniatrics at the University
of Tokyo (Hirose, Gay, & Strome, 1971).

If the investigator is interested in recording from a very small volume
of tissue, the recording surfaces of the electrodes must be made as small as
possible, while if the investigator is interested in a representation of the
activity of the whole muscle, the recording surface must be as large as
possible, while still remaining within the confines of the same muscle.
Obviously, since the laryngeal muscles are small, some conventional
configurations of electrode may record activity from more than one muscle
(Dedo & Dunker, 1966). In the conventional hooked wire electrode, the hooks,
which hold the wire in the muscle, also act as the recording points for the
bipolar pickup, through their cut ends. However, the spacing between the two
points is set arbitrarily by the way that the electrode happens to hook into
the muscle, and, indeed, may change within the recording session (Jonsson &
Komi, 1973). Since this type of electrode apparently records from a very
small volume of tissue, the fact that the distance between the electrode tips
is not fixed seems a design flaw. At Haskins, we have been exploring various
designs in which the functions of stabilization and recording are separated,
and the field size is fixed by the separation between recording points.

PROPERTIES OF MOTOR UNITS

Exploring the relationship between ideal electrode and experiment
requires a systematic discussion of the events within a muscle as we now know
them, largely from studies of limb muscles. Most issues of muscle
characteristics have been explored with only a limited number of muscles.

Let us begin with the single motor unit. In constant force contractions,
it will fire with an overall mean interspike interval and standard deviation
(DeLuca & Forrest, 1973; Figure 6), which can be used to characterize the
unit, and, perhaps, the muscle itself. MacNeilage (1973) has shown that
single motor units from CT and PCA fire at mean frequencies of about 15
impulses per second during low frequency phonation. He suggested that these
rates were intermediate between rates for limb and trunk and intraocular
musculature, as we might expect from these other properties. However, he
found no evidence for the different kinds of units, tonic and kinetic,
postulated by Tokizane and Shimazu (1964) to be identifiable on the basis of
the relationship between variability and firing rates (MacNeilage, Sussman, &
Powers, 1977). Other authors (DeLuca & Forrest, 1973; Hannerz, 1974; Leifer,
1969) have found continuous distributions of single unit properties for
various limb muscles.

Figure 6. Distribution of interpulse intervals from a single motor unit.
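
The two statistics used above to characterize a unit are easily computed from
a list of firing times. The sketch below is illustrative only: the firing
times are invented for the example (chosen to give a rate near the 15
impulses per second mentioned above), and the function name `isi_stats` is
hypothetical.

```python
import statistics

def isi_stats(spike_times_ms):
    """Return (mean ISI in ms, SD of ISI in ms, mean rate in impulses/s)."""
    isis = [later - earlier
            for earlier, later in zip(spike_times_ms, spike_times_ms[1:])]
    mean_isi = statistics.mean(isis)
    sd_isi = statistics.stdev(isis)        # sample standard deviation
    return mean_isi, sd_isi, 1000.0 / mean_isi

# Hypothetical firing times (ms) for one unit; not measured data.
times = [0.0, 60.0, 130.0, 195.0, 270.0, 330.0]
mean_isi, sd_isi, rate = isi_stats(times)
print(mean_isi, round(sd_isi, 2), round(rate, 1))
```

Plotting the standard deviation against the mean interval for many units is
one way to look for the separate tonic and kinetic clusters discussed above;
a single continuous cloud would argue against them.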





During force-varying isometric contractions, there is a complex
relationship between variation in firing rate and recruitment. At low forces,
force tends to be increased by the recruitment of additional units, with
successively recruited units having higher firing rates at recruitment. As
force increases, individual units increase firing rates, and at the highest
force levels, very little recruitment occurs. Synchronization of firing of
units may occur as the muscle fatigues (DeLuca, 1978).

The most consistent observation of motor unit behavior is the
relationship between the size of the unit and its force output and order of
recruitment with increasing muscle force, the "size principle" (Henneman,
1975). While this relationship has not been observed for any of the laryngeal
muscles, it has been demonstrated for the masseter in humans (Yemm, 1977) and
for the anterior belly of the digastric by MacNeilage, Sussman, Westbury, and
Powers (1979), and there is no reason to believe that the laryngeal muscles
behave in a very unusual way in this respect. However, for all muscles, there
is some question as to whether there are reversals of recruitment order for
rapid, anisometric contractions.

Since the territories of motor units overlap, with increasing forces of
contraction it is increasingly difficult to identify individual units. For
studies of such questions, electrode size must be reduced, and sophisticated
programs for the identification of motor units developed (LeFever, 1980).

THE INTERFERENCE PATTERN

Most electromyographic studies of the laryngeal muscles have been
concerned, not with the properties of individual motor units, but with the
functions of the muscles as a whole. Typically, the studies have related the
characteristics of a given muscle activity to some sort of output, such as
pitch. The electromyographic signal studied is usually an interference
pattern, the signal from a large number of motor units. As an aid in
visualization, it is interesting to look at a synthesized interference
pattern, Figure 7 (LeFever & DeLuca, personal communication). The figure
shows 20 motor units of shapes that would be characteristic of those found in
an electrode field during a constant force, isometric contraction. Their
sizes and the relative extent of positive and negative deviations from
baseline vary with distance from and orientation to the electrode. The sum of
positive and negative deviations is shown in the bottom line of the figure.
Obviously, there is summing and cancellation of signals from individual
units, depending on their phase relations. The resultant signal is noisy, and
difficult to deal with quantitatively. If the electrode size is reduced, so
that fewer units are represented in the signal, the interference pattern
becomes more variable as a function of time (Figure 8A).
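
A synthesized pattern of this general kind is easy to generate. The sketch
below is not the LeFever and DeLuca synthesis; it is a toy under stated
assumptions: each unit fires a two-sample biphasic wavelet at jittered
intervals, units differ only in amplitude and mean interval, and all names
and parameter values are hypothetical.

```python
import random

def unit_train(n_samples, amplitude, mean_isi, jitter, rng):
    """One unit's train: a biphasic (+a, -a) wavelet at each firing time."""
    train = [0.0] * n_samples
    t = rng.uniform(0.0, mean_isi)                 # random starting phase
    while t < n_samples - 1:
        i = int(t)
        train[i] += amplitude
        train[i + 1] -= amplitude
        # Next firing: jittered interval, never closer than 2 samples.
        t += max(2.0, rng.gauss(mean_isi, jitter))
    return train

def interference_pattern(n_samples=2000, n_units=20, seed=1):
    """Return (trains, summed) for n_units independent toy motor units."""
    rng = random.Random(seed)
    trains = [unit_train(n_samples,
                         amplitude=rng.uniform(0.2, 1.0),
                         mean_isi=rng.uniform(50.0, 100.0),
                         jitter=8.0,
                         rng=rng)
              for _ in range(n_units)]
    # The interference pattern is the sample-by-sample sum of all trains.
    summed = [sum(column) for column in zip(*trains)]
    return trains, summed

trains, summed = interference_pattern()
print(len(trains), len(summed))
```

Lowering `n_units` mimics a smaller electrode field: fewer trains contribute,
and the summed trace becomes sparser and more variable from moment to moment.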

A number of steps must be taken to deal with such signals. The usual
approach has been to rectify and integrate. The effects of rectification are
shown in Figure 8B. The traditional use of the rectified and integrated EMG
signal is based on a large body of research investigating the relationship
between the magnitude of the EMG signal so obtained and the force output of
the muscle (Bigland & Lippold, 1954; Bouisset, 1973; Bouisset & Maton, 1973;
Inman, Ralston, Saunders, Feinstein, & Wright, 1952; Lippold, 1952; Zuniga &
Simons, 1969). This measure ("integrated EMG") varies roughly linearly with
force for isometric contractions at moderate force levels, but at higher
levels of force the relationship becomes nonlinear. The situation becomes far
more complex for anisometric contractions, in part because the mechanical
efficiency of a muscle depends on its length as well as its velocity of
shortening or lengthening. Since the events of interest in speech research
are typically of this latter sort, we can expect the magnitude of the EMG
signal to provide no more than an overall index of mechanical performance.
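
The rectify-and-integrate step can be written out directly. This is a minimal
sketch, assuming full-wave rectification and integration over successive
non-overlapping windows; the function names and the toy sample values are
hypothetical.

```python
def rectify(signal):
    """Full-wave rectification: keep the magnitude, drop the sign."""
    return [abs(v) for v in signal]

def integrate(signal, window):
    """Sum successive non-overlapping windows of `window` samples.

    A trailing partial window, if any, is discarded.
    """
    return [sum(signal[i:i + window])
            for i in range(0, len(signal) - window + 1, window)]

# Toy interference pattern: positive and negative deviations that would
# cancel completely if summed without rectification.
raw = [1.0, -1.0, 2.0, -2.0, 0.5, -0.5, 3.0, -3.0]
print(integrate(raw, window=4))            # unrectified sums
print(integrate(rectify(raw), window=4))   # "integrated EMG"
```

The unrectified sums are zero in both windows, while the rectified sums track
the size of the activity, which is the reason for rectifying before
integrating.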

A possibility that we have explored informally at Haskins is calculating
the variance of the interference pattern, which is equal to the sum of the
variances of the motor unit action potential trains contributing, and hence
does not lead to the loss of contributions of motor units due to
cancellation, as does the more conventional measure.
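
The additivity of variances holds when the contributing trains are
uncorrelated, and a quick numerical check makes the point. The sketch below
is an illustration under that assumption: independent zero-mean noise stands
in for each unit's action potential train, and the four "unit sizes" are
hypothetical.

```python
import random

def variance(xs):
    """Population variance of a sequence."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

rng = random.Random(0)
n_samples = 10000
sds = [0.5, 1.0, 1.5, 2.0]            # hypothetical unit "sizes"

# Assumption: independent zero-mean noise stands in for each unit's train.
trains = [[rng.gauss(0.0, sd) for _ in range(n_samples)] for sd in sds]
pattern = [sum(column) for column in zip(*trains)]

var_of_sum = variance(pattern)
sum_of_vars = sum(variance(train) for train in trains)
print(round(var_of_sum, 2), round(sum_of_vars, 2))
```

The two numbers agree to within sampling error. If units begin to fire in
synchrony, as may happen in fatigue, the trains are no longer uncorrelated and
the variance of the pattern will depart from the simple sum.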

We have said very little about the time constant to be used for
integration. We use a 5 millisecond hardware integration window and smooth
further algebraically, using software programs in which a time constant may
be chosen. Individual tokens recorded with hooked-wire electrodes show
sizable variations that are not represented in the mechanical output of the
muscle as a whole. For speech, time-smoothing is useful only to the point
where it does not obscure the sequencing of underlying articulatory events.
An alternative way of smoothing is ensemble averaging. The effects of
time-smoothing and ensemble averaging are shown in Figure 9, which shows
averaged and integrated signals from repeated utterances. The details of
these analysis procedures are discussed at greater length in laboratory
reports (Kewley-Port, 1973, 1974).
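
The two kinds of smoothing can be contrasted in a short sketch. This is not
the laboratory's software: a centered moving average is assumed as the
time-smoother, tokens are assumed to be already aligned in time, and the
function names and toy values are hypothetical.

```python
def time_smooth(signal, half_width):
    """Centered moving average; the window is truncated at the edges."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half_width): i + half_width + 1]
        out.append(sum(window) / len(window))
    return out

def ensemble_average(tokens):
    """Average time-aligned tokens sample by sample."""
    return [sum(column) / len(column) for column in zip(*tokens)]

# Two hypothetical tokens: the same burst with opposite-signed noise added.
burst = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
noise = [0.2, -0.2, 0.2, -0.2, 0.2, -0.2]
tokens = [[b + n for b, n in zip(burst, noise)],
          [b - n for b, n in zip(burst, noise)]]

averaged = ensemble_average(tokens)
smoothed = time_smooth([0.0, 0.0, 3.0, 0.0, 0.0], half_width=1)
print(averaged)
print(smoothed)
```

Ensemble averaging removes token-to-token variation without blurring the time
axis, whereas the moving average spreads a one-sample event over its
neighbors. That spreading is the cost referred to above when time-smoothing
is said to risk obscuring the sequencing of articulatory events.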

LARYNGEAL MUSCLE STUDIES

Having reviewed the general properties of muscles, and of the laryngeal
muscles in particular, as well as some technical problems, we turn now to the
results of electromyographic studies of the function of these muscles in
speech. The most primitive question is, perhaps, what muscles should be
considered as laryngeal muscles? Traditionally, the muscles of the larynx
have been divided into two groups, intrinsic and extrinsic. The identity of
the intrinsic muscles is readily agreed upon; they are the cricothyroids
(CT), the thyroarytenoids (TA), the interarytenoids (IA), the lateral
cricoarytenoids (LCA), and the posterior cricoarytenoids (PCA). The identity
of the extrinsic laryngeal muscles is more difficult to specify. If we take
the empirical point of view that any muscle that affects the positions of
thyroid, cricoid, and arytenoid cartilages relative to each other may be
considered to be an extrinsic laryngeal muscle, then a wide variety of
muscles, not normally considered in relation to the larynx, must be included.
For example, Painter (1978) has produced some evidence that genioglossus
activity may influence pitch, and Erickson, Liberman, and Niimi (1977) have
produced the same sort of evidence for geniohyoid. The implication is that a
wide variety of muscles may affect pitch, as Sonninen suggested many years
ago (1956). However, given the lack of detailed information about secondary
effects on vocal fold adjustment, only the three strap muscles, the
sternohyoid, the thyrohyoid, and the sternothyroid, will be considered as
extrinsics here.

Figure 9. Individual and averaged tokens for the spoken utterance "faz
          map." The top row represents averages of 20 tokens. Four tokens
          are shown beneath the average. The first two columns show EMG
          output from the levator palatini, after sampling and rectification,
          before and after smoothing. The remaining columns show intraoral
          pressure, audio amplitude, fundamental frequency, and measured
          velar height. Haskins Laboratories.

Fundamental Frequency Control. Electromyographic studies on the
regulation of pitch have been reported by many authors. More recent
electromyographic studies have included those of Hirano, Vennard, and Ohala
(1970), Shipp and McGlone (1971), Gay, Hirose, Strome, and Sawashima (1972),
and Baer, Gay, and Niimi (1976).



These studies all conclude that cricothyroid activity increases as the
pitch is raised, at least over most of the pitch range, as we might have
expected from the mode of action of this muscle in producing torque around
the cricothyroid joint. This action presumably underlies the observed
lengthening of the folds with increasing f0.

The activity of TA also increases as the pitch is raised over most of the
pitch range, although it is more active in chest voice than in falsetto
(Hirano, Ohala, & Vennard, 1969; Hirano et al., 1970; Baer et al., 1976), but
the function of this activity is obscure. The thyroarytenoid could act, of
course, to produce a shortening force in opposition to CT, although this
cannot be its primary function, since its activity increases with pitch rise,
rather than pitch fall. One theory as to its primary function, by van den
Berg (1960), suggests that it exerts "medial compression," limiting the
horizontal extent of vocal fold vibration, permitting the more effective play
of aerodynamic forces. An alternate possibility is that its tension is
adjusted, with compensating adjustments of CT, to tune the natural vibrating
frequency of the muscle itself, considered as a tissue mass, since the muscle
makes up the bulk of the folds and so determines, in large part, their
vibratory characteristics. A secondary problem in the characterization of TA
activity is that there is disagreement in the literature as to whether there
are functional or anatomical differences between lateral and medial (vocalis)
parts of TA, so that an adequate description of the function of one part may
not suffice for the other (Sawashima, 1970).

Reports on the other laryngeal adductors, IA, LCA, and the more lateral
parts of TA, tend to show increasing activity with increasing pitch. Van den
Berg (1960) suggested, on the basis of cadaver experiments, that the IA might
be active without the laterals at very low pitches, but this possibility has
never been experimentally verified.

Some authors (e.g., Dedo, 1970; Gay et al., 1972; Baer et al., 1976)
report increases of PCA activity at the highest f0's when intensity is great,
although there is not universal agreement on this point (Shipp & McGlone,
1971). Although this muscle is normally an abductor, its activity at high f0
is thought to brace the arytenoids against the anterior pull of the vocal
folds. The observations of Gay et al. are summarized in Figure 10.



Control of f0 by the extrinsic muscles of the larynx is less well
understood than control by the intrinsic muscles. The larynx, and f0, move up
and down during singing by untrained singers, or during speech, although
trained singers learn to keep the larynx at an approximately constant low
position (Sonninen, 1956; Shipp & Izdebski, 1975). These movements are
produced largely by activity of the extrinsic attachments to the larynx,
especially by the strap muscles.

Strap muscle activity (sternohyoid, sternothyroid) is correlated with f0
at both its highest and lowest levels. Although Kakita and Hiki (Note 1) have
reported differentiation among these muscles, the weight of the evidence is
that they act together in controlling pitch. This finding is supported both
by electromyographic measurements (Faaborg-Andersen & Sonninen, 1960; Baer et
al., 1976) and by clinical observation of patients who have had these muscles
sectioned (Sonninen, 1956). Although, on anatomical grounds, it would seem
that the sternothyroid muscle ought to increase f0 by tilting the thyroid
cartilage down and forward, and that the thyrohyoid ought to decrease f0 by
tilting the thyroid cartilage up and back, Sonninen showed that the situation
is more complex. In experiments with cadavers and in stimulation experiments
with patients undergoing thyroidectomy, he found that the effect on the larynx
of activity of these muscles depended on posture and head position. The
sternothyroid, in particular, can tilt the thyroid cartilage either way.

Sonninen developed an "external frame function" theory to account for f0
raising, based on his own results and those of other investigators. According
to this theory, all the strap muscles work in conjunction with the anterior
suprahyoid muscles. Although the strap muscles may or may not raise the
larynx, their main function is to pull the thyroid cartilage forward. At the
same time, activity of the cricopharyngeus and downward pull of the esophagus
exert a downward and backward force on the posterior part of the cricoid
cartilage.

Since the mechanism for application of the "external frame function"
theory to f0 lowering has been elusive, alternative theories have been
advanced. One of these is the passive theory, stating that f0/larynx lowering
is due to relaxation of the mechanisms for f0/larynx raising. Although
passive lowering can explain some of the observed relationships, two facts
support the notion of at least an ancillary active mechanism.
Electromyographic activity accompanies lowering, as we noted above, and studies
of vertical larynx position show that the position during low frequency
phonation is lower than that in rest position (Shipp & Izdebski, 1975). A
second theory, attributed to Ohala (1972), suggests that raising and lowering
the larynx affects f0 directly through adjustment of the vertical tension of
the vocal fold cover, which is continuous with the lining of the trachea.
This theory cannot be adequately evaluated without improved understanding of
the vibratory mechanism of the vocal folds and actual measurements of
vertical tension in raised-larynx and lowered-larynx configurations.
Finally, a theory accounting for f0 lowering by laryngealization has been
proposed by Lindqvist (1969). This theory asserts that the vocal folds are
shortened (and, incidentally, transglottal pressure is reduced) by activity of
the muscle fibers of the aryepiglottic sphincter. This mechanism does not
appear to require lowering of the larynx and hence does not explain the
observed movements or associated EMG activity. It may operate jointly with or
independently of other mechanisms.

Results of studies of strap muscle function in speech first suggested
that although f0 falls were always accompanied by an increase in strap muscle
activity, the activity did not always precede f0 falls, and showed substantial
effects of segmental variables (Collier, 1975; Hirano et al., 1969). Later
analysis, however, suggested that strap activity does precede pitch drops from
a mid to low range (Atkinson & Erickson, 1977; Erickson et al., 1977).

A problem in studying pitch control in speech has been the difficulty of
analyzing the relationships among f0, subglottal pressure, and the antecedent
activity of the large number of relevant muscles. One technique that has
been found useful cross-correlates f0 and integrated EMG (Atkinson, 1978).
The delay at which the correlation reaches a maximum can be used to estimate
the response time of the muscle. The magnitude of the correlation at this
delay can then be used in estimating the magnitude of that muscle's
contribution to pitch control. The analysis can be further refined by dividing
the fundamental frequency range into subranges. Atkinson's study shows the
contribution of strap muscle activity to be greatest at low frequencies, while
CT activity has its greatest effects at high frequencies. Although the data
analyzed in the study were extremely limited, further exploitation of the
technique seems warranted.
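
The lagged-correlation idea described above can be sketched in a few lines. This is only an illustration on synthetic data, not a reconstruction of Atkinson's procedure: the function name best_lag, the sampling assumptions, and the 40-sample delay are all hypothetical, and both records are assumed to share one time base.

```python
import numpy as np

def best_lag(emg, f0, max_lag):
    """Return (lag, r): the delay in samples, with EMG leading f0, at which
    the Pearson correlation between the two records is largest in magnitude."""
    emg = np.asarray(emg, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        # Pair EMG at sample t with f0 at sample t + lag.
        x = emg[: len(emg) - lag]
        y = f0[lag:]
        r = np.corrcoef(x, y)[0, 1]
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic check: f0 is a delayed, scaled copy of a smoothed random envelope.
rng = np.random.default_rng(0)
envelope = np.convolve(rng.random(2000), np.ones(25) / 25, mode="same")
true_delay = 40  # samples; hypothetical value
f0_track = 120.0 + 30.0 * np.roll(envelope, true_delay)
lag, r = best_lag(envelope, f0_track, max_lag=100)
```

The returned lag plays the role of the estimated response time, and the magnitude of r at that lag the role of the estimated contribution. Taking the absolute value lets the same search serve muscles whose activity accompanies f0 falls rather than rises.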

There is, nonetheless, a limit to the amount of reliance one can place on
the results of gross correlation studies. An ingenious new technique for
studying the relationship of f0 and the activity of the various laryngeal
muscles has been suggested by Baer (1978). The technique was adapted from one
originally designed for the study of skeletal muscles (Milner-Brown, Stein, &
Yemm, 1973). Continuous records were made of electromyographic activity from
laryngeal muscles and of voice fundamental frequency from a subject producing
steady, sustained phonation at low f0. The fundamental frequency record
exhibits small perturbations around a nominally constant value. If we assume
that these perturbations represent the response to the firing of single motor
units in those muscles that control pitch, then an average-response
computation of fundamental frequency triggered by single motor unit firings of
any muscle should exhibit a systematic deviation in the interval immediately
following the firings. Figure 11 shows the results of following this
procedure for CT. Using this technique, muscles whose activity is grossly
inter-correlated can be decorrelated to examine their individual effects on
some variable. We feel that this technique shows great promise in the
application just suggested, and others.
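
A minimal sketch of the average-response computation may make the logic concrete. It uses a synthetic f0 record in which every firing adds the same small bump; the bump shape, firing times, and noise level are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np

def triggered_average(signal, trigger_idx, window):
    """Average the `window` samples of `signal` that follow each trigger index.
    Triggers too close to the end of the record are skipped."""
    signal = np.asarray(signal, dtype=float)
    segments = [
        signal[i : i + window]
        for i in trigger_idx
        if i + window <= len(signal)
    ]
    return np.mean(segments, axis=0)

# Synthetic record: nominally constant f0 plus noise, with a fixed bump
# added after every motor unit firing.
rng = np.random.default_rng(1)
n, window = 20000, 50
bump = 0.5 * np.hanning(window)             # hypothetical unit response, in Hz
firings = np.arange(100, n - window, 137)   # hypothetical firing times (samples)
f0 = 100.0 + 0.3 * rng.standard_normal(n)
for i in firings:
    f0[i : i + window] += bump
avg = triggered_average(f0, firings, window)
```

Because the noise is unrelated to the firing times, it shrinks roughly as the square root of the number of firings averaged, while the response locked to the firings survives.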



Stricture Control and Voicing Features 


A second dimension of laryngeal adjustment in speech is stricture
control, the degree to which the laryngeal sphincter is closed by the
approximation of the vocal folds. While these adjustments can be used to
produce overall changes in voice quality, most speech studies of this
dimension have been aimed at understanding the mechanism of consonant voicing.

Fiberoptic visualizations of the glottis (Sawashima, Abramson, Cooper, &
Lisker, 1970; Kagaya, 1974) show that voiced and voiceless consonants are
characterized by differences in glottal opening. It is the timing of the
abduction and adduction of the folds, relative to the movement of the upper
articulators, that distinguishes consonant classes within and across
languages.

Figure 11. Single motor units of the cricothyroid, aligned and averaged, with
parallel measure of pitch perturbation. See text for explanation.
From Baer, 1981.

Anatomically, the five intrinsic laryngeal muscles can be divided into
three functional groups with respect to stricture control: abductor (PCA),
adductor (INT, TA, LAT), and tensor (CT). The question can then be asked
whether the muscles function in speech in ways that the classification would
suggest. Is there active abduction and adduction in voicing maneuvers? Do
the adductors function together? Finally, is the activity of adduction and
abduction accompanied by changes in tensing?

Abduction and adduction for voicing are clearly accomplished by the
reciprocal activity of PCA and INT, as has been demonstrated
in a number of studies (Hirose & Gay, 1973; Fischer-Jørgensen & Hirose, 1974;
Hirose & Ushijima, 1976).

Figure 12 shows a fairly typical pattern obtained for this pair of
muscles (Hirose, Lisker, & Abramson, 1972). The general conclusion is that
when the abductor (PCA) contracts, the adductor (INT) relaxes. The relationship
has been quantified. Hirose (1977) showed that for a series of utterances
containing voiced and voiceless stops, produced by a Japanese talker, the
value of the correlation coefficient ranges between -.85 and -.65. The
analysis does not make it clear what variables affect the value in a critical
way.
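
As a rough illustration of how such a coefficient is obtained, the sketch below computes the Pearson correlation between two EMG envelopes sampled on a common time base. The envelopes are synthetic and in arbitrary units; their shapes and the noise level are assumptions made only so that the example runs.

```python
import numpy as np

def reciprocity(abductor, adductor):
    """Pearson correlation between two EMG envelopes on the same time base.
    Values near -1 indicate reciprocal activity."""
    return float(np.corrcoef(abductor, adductor)[0, 1])

# Synthetic envelopes: the adductor falls when the abductor rises, plus noise.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 500)
pca = np.exp(-((t - 0.5) / 0.08) ** 2)
int_env = 1.0 - 0.8 * pca + 0.15 * rng.standard_normal(t.size)
r = reciprocity(pca, int_env)
```

Added noise pulls the coefficient away from -1 even when the underlying relationship is perfectly reciprocal, which is one reason the variables affecting the value are hard to read off from the coefficient alone.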



The extent to which the activity of the adductor group is correlated in
such maneuvers is still unclear. Some time ago, van den Berg and Tan (1959)
showed, in cadaver experiments, that the different adductor muscles can be
used to close the cartilaginous and membranous parts of the larynx
separately. Thus, we might expect some differences between the activity
patterns of INT on the one hand, and LAT and TA on the other. Such differences
have been seen in studies of Korean stops (Hirose, Lee, & Ushijima, 1974),
Danish stød (Fischer-Jørgensen & Hirose, 1974), and glottal stops (Hirose &
Gay, 1973). Apparently, the activity of LAT and TA is connected to the
necessity for strong medial compression in these productions. However, the
detailed effects of differential contraction of these muscles on the shape of
the glottis are not known. Figure 13 shows the contrast in activity of INT and
VOC (TA) for the three types of voiceless stop found in Korean. The important
point to note, apart from the obvious overall differences, is that there is a
sharp peak in VOC activity for the glottalized Korean stop at consonant
release, probably associated with increased tension of the folds.

A recent experiment by Yoshioka (1979) also suggests circumstances in
which we perhaps will observe differentiation among laryngeal adductors in
stricture control. He found that /h/ and /s/ may be produced with equal
glottal widths, and equivalent patterns of reciprocal PCA and INT activity,
but still differ in the presence of vibration at the edges of the membranous
portion of the folds in some examples of /h/. An obvious possibility is that
other intrinsic laryngeal muscles show differences in activity for stricture
control for the sounds.

A third question associated with the activity of the vocal folds in
voicing control is whether activity of CT is associated with abduction or
adduction. Stevens' model of glottal activity suggests that the tension of
the vocal folds will affect the likelihood of vibration, for a given pressure
drop across the glottis. It is therefore possible that some stops are

Figure 13. Averaged EMG curves for INT and VOC for the three bilabial stops of
Korean. [ph] is voiceless and aspirated, [p'] is voiceless and
slightly aspirated, and [p] is voiceless and glottalized. From
Hirose, Lee, and Ushijima, 1974.

Figure 14. Cricothyroid activity for the three bilabial stops of Korean. The
three curves in each box represent utterances containing the vowels
/i/, /a/, and /u/. From Hirose, Lee, and Ushijima, 1974.

characterized by contrasts in CT activity, particularly those that contrast in
degree of aspiration, like those of Korean (Hirose et al., 1974). A study of
stop production in a single speaker (Figure 14) fails to support the
hypothesis of CT differentiation, but small differences in CT activity
accompanying voicing contrasts have been found from time to time.

The brief summary of laryngeal muscle function in this section and the
preceding one reveals that we now have a gross qualitative sketch of the
activity patterns, and the technical means at hand to elaborate this picture,
to match models and observations of the larynx developed in other ways.
However, we might now ask what clinical uses might be made of EMG using
presently available techniques.


ELECTROMYOGRAPHY IN FUTURE DEVELOPMENTS 


At present, EMG is widely used in diagnosis of neuromuscular disorders. 
It has not been used this way for the laryngeal muscles, although it perhaps
could be. For example, it seems possible to detect abnormal single motor unit 
firing patterns in these muscles, abnormal synchronization of motor unit 
firings (Hirose, 1977), or, perhaps, to differentiate peripheral neurogenic 
and myogenic disorders. 


Another use, from my point of view a very exciting one, is to use EMG as
a technique for examining articulatory programming and its breakdown. The
work described in this paper, and others, can be used to show a very tightly
time-constrained coordination of laryngeal and supra-laryngeal events in
running speech. Aspects of this coordination appear to break down in
stuttering (Freeman & Ushijima, 1978), and in apraxia (Freeman, Sands, &
Harris, 1978). While the broad perceptual consequences of breakdown in
laryngeal coordination have often been described (e.g., Darley, Aronson, &
Brown, 1975), it seems far more direct to look at the underlying failures of
patterning. One of the most unfortunate consequences of the description of
normal and abnormal speech in terms of transcriptional entities has been to
focus description of speech motor behavior on the attainment or failure of
attainment of stationary acoustic or articulatory targets, rather than on the
temporal prescription for coordinated activity. For normal speakers, we need
to investigate what maintains these prescriptions, by systematically
attempting to disrupt them. For abnormal speakers, we need, first, to describe
the disrupted speech in terms of the constituent articulatory acts, and second,
to investigate the relative roles of various factors, such as feedback, in
maintenance of existing coordinations.

INVESTIGATION OF THE PHONATORY MECHANISM*

Thomas Baer

Abstract. A rational approach toward the development of improved
techniques for the prevention, detection, diagnosis, and correction
of vocal pathologies rests on an improved understanding of voice
mechanisms. To achieve these goals, we need to better understand
the dimensions of phonatory performance and their dependence both on
the state of laryngeal structures and on patterns of control.
Because of the inaccessible location of the larynx, few direct
measurements of this performance are possible. Quantitative
mathematical modeling is a useful vehicle for studying laryngeal vocal
function. Continuation and extension of excised-larynx and animal
studies can provide detailed data in support of the development and
testing of these models. Human experiments, in vivo, aimed at
factoring out the phonatory consequences of variations in individual
laryngeal control parameters are suggested as a means of further
extending such studies.



INTRODUCTION

A rational approach toward the development of improved techniques for the
prevention, detection, diagnosis, and correction of vocal pathologies rests on
an improved understanding of voice mechanisms. For prevention, we hope to
understand the pattern of control, and its correlates in vibratory
performance, whose breakdown leads to physiological failures in the laryngeal
structures. Our research in detection and diagnosis is directed toward
isolating non-invasive multidimensional measures capable of differentiating
the performance of larynges with different pathologies from the performance of
normal larynges and from each other. In the area of correction, we hope to
improve the conceptual framework for voice training and therapy, and improve
the ability of surgeons to predict the phonatory consequences of alternative
procedures. To achieve these goals, we need to better understand the
dimensions of phonatory performance and their dependence both on the state of
laryngeal structures and on patterns of control.


The process of phonation can be separated into three components: a
phonatory system, its inputs, and its outputs. The system consists of two
subsystems: one aerodynamic (the glottis), and the other mechanical (the
vocal folds). Inputs to this system are muscular adjustments, transglottal
pressure, and some other less significant variables. Outputs may be considered
to be the pattern of mechanical vibrations in the vocal folds, or, more
significantly for voice production, the pattern of airflow into the vocal
tract. This latter output then serves as input to another system, the vocal
tract, whose output is the radiated voice signal.

*A version of this paper was presented at the Conference on Assessment of
Vocal Pathology, Bethesda, Md., April 1979. (Proceedings to be published in
ASHA Reports.)






The myoelastic-aerodynamic theory of phonation (van den Berg, 1958)
accounts grossly for the nature of phonation in terms of a passive interaction
between the two phonatory subsystems when an appropriate combination of inputs
is applied. The acoustic theory of speech (Fant, 1960) accounts for the
effects of the vocal tract in transforming the glottal source signal to a
radiated acoustic output signal. Although both of these theories have been
well known for two decades or more, there are significant details that remain
poorly understood. Thus, we have only limited ability to estimate the glottal
volume velocity waveform by canceling the effects of the vocal tract from the
speech output signal, and we have only limited ability to separate the
influences of inputs to the phonatory system from the influences of the system
itself on details of its output. Because of the inaccessible location of the
larynx, few direct measurements of this output are possible.

Investigations into the mechanisms of phonation and its control have
relied heavily on research with models. Much basic knowledge can be derived
from experiments with excised larynges (e.g., van den Berg & Tan, 1959) and
with live animal preparations, which serve as simplified models of their
intact counterparts but which can be more carefully observed and more
systematically controlled. Fabricated mechanical models have also been used
to test hypotheses about the mechanism. For example, Smith (1962)
experimented with a "membrane-cushion" model, which seems to incorporate some
elements of the more recent "cover-body" theory of Hirano (1974, 1975, 1977).
Mostly, however, mathematical descriptions and computer simulations have been
used to formalize and refine knowledge about the mechanisms. Thus, the
development of these models is both a goal and a tool of phonatory research.

The history of these modeling efforts parallels the improvement of our
understanding of the system. As our understanding has become more complete,
the models have become more complex. Building on the aerodynamic studies of
van den Berg, Zantema, and Doornenbal (1957), Flanagan and Landgraf (1968)
modeled the vocal folds as a simple mass-spring system performing horizontal
movements with one degree of freedom. It soon became apparent that an
additional degree of freedom was required to account for vertical phase
differences. Ishizaka and Matsudaira (1972) corrected some errors in van den
Berg's aerodynamic analysis, and showed that a two-mass model of the vocal
folds could more realistically account for the conditions under which
phonation could be initiated. Ishizaka and Flanagan (1972) simulated the
two-mass model, extending the results of Ishizaka and Matsudaira, but were
limited by this model's inability to account realistically for the closed
period of the glottal cycle. Titze (1973, 1974) increased the number of masses
to 16, in order to allow a distribution of vibrations along the
anterior-posterior direction. This model also allowed for some vertical
movements. Finally, Titze and Talkin (1979) have been investigating more
sophisticated models that explicitly model the layered structure of the vocal
folds (Hirano, 1974) and their behavior as a vibrator, and that incorporate
tissue viscosity and bulk incompressibility.
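
To give a sense of what the simplest of these models implies, the sketch below computes the undamped natural frequency of a one-degree-of-freedom mass-spring system. It is not the Flanagan and Landgraf model, which also includes the aerodynamic coupling; the mass and stiffness values are hypothetical and were chosen only so that the result falls in a speech-like range.

```python
import math

def natural_frequency_hz(mass_kg, stiffness_n_per_m):
    """Undamped natural frequency of a one-degree-of-freedom mass-spring system."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)

# Hypothetical lumped parameters, not measured values.
m = 0.1e-3   # effective vibrating mass, kg
k = 80.0     # effective stiffness, N/m
f = natural_frequency_hz(m, k)
f_stiffer = natural_frequency_hz(m, 4.0 * k)
```

Quadrupling the stiffness doubles the frequency, which is the sense in which a single lumped stiffness can stand in, very crudely, for changes in vocal fold tension.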

Though it is understood that models must be complex to account
realistically for the phonatory mechanism, there is also a danger inherent in
the growth of complexity. As the number of degrees of freedom and the number of
independent parameters multiply, the possibilities for accurately modeling the
detailed mechanism improve, but so do the possibilities for producing
apparently realistic behavior due to mechanisms that may not represent those of
the real larynx. For our purposes, models must be mechanistically correct as
well as descriptive of the output. It is therefore essential to determine as
many of their parameters as possible and the constraints among them by direct
measurement, and to evaluate the performance of these models in the greatest
possible detail. Furthermore, we ought to be able to make directly testable
predictions on the basis of our modeling efforts.

Further progress in understanding the detailed mechanism of phonation and
in developing an accurate model of it thus depends on detailing the mechanical
characteristics of vocal folds and determining their variation as functions
of laryngeal control. It also depends on improved methods for measuring more
detailed performance characteristics of real larynges, for comparing model
performance to the performance of real larynges, and for generating testable
predictions from modeling studies. Hirano has discussed, both at the
Conference on Assessment of Vocal Pathology and in other publications (Hirano,
1975, 1977), measurements of mechanical properties of the vocal folds and some
patterns of their variation with the contractions of individual muscles.
Other papers at the conference will discuss techniques for obtaining detailed
measurements, and Titze's paper will discuss methods for comparing the
performance of models with these measurements on in vivo larynges. In the
remainder of this paper, the continuation and extension of excised larynx and
animal studies is urged because of their ability to produce detailed data for
the direct testing of models. Then, some experiments in vivo, aimed at
factoring out the phonatory consequences of variations in individual control
parameters, are suggested as a means of further extending these studies.

I. EXPERIMENTS WITH EXCISED LARYNGES AND ANIMALS

It is well known that excised larynges, both canine and human, can
simulate many of the vibratory characteristics of normal human larynges when
they are attached to a pseudosubglottal system that supplies suitably
conditioned airflow and when the positions of the laryngeal cartilages are
suitably controlled, using strings to simulate the functions of muscles. As a
simplified model of their intact counterparts, excised larynges offer several
advantages. Because they are more accessible, they can supply observations
and measurements that cannot be made in vivo. For example, both Matsushita
(1969) and Baer (1975) have developed techniques for observing vibration
patterns both from the normal supraglottal aspect and from the subglottal
aspect. Baer also developed a technique for marking the vocal folds with
small particles and tracking their frontal-plane movement trajectories
throughout a glottal cycle using a microscope and stroboscopic illumination.
Measurements could be made from both the supraglottal and subglottal aspects,
and with the aid of qualitative observations, vocal fold shapes in the frontal
plane throughout a cycle could be reconstructed from the measurements. With
excised larynges, measurements of subglottal pressure and glottal airflow can
be simplified. Furthermore, almost any technique for measuring characteristics
of phonatory vibrations can be used more effectively on an isolated
larynx. Additional advantages are that the configuration of an excised larynx
can be held constant or systematically varied, that its structures can be
experimentally modified to determine the effects on vibration, and that they
are accessible for measurement of mechanical properties in their configuration
for voice production. The major limitations of the excised preparation,
namely, that its death changes some of its mechanical properties, including
its ability to tense the vocalis muscle, can be overcome by using live animal
preparations and stimulating the muscles electrically. However, these
advantages have not been fully exploited.

Figure 1. Schematic diagram of apparatus for measuring vibration patterns of


Baer's work with excised larynges was directed toward elucidating the
phonatory mechanism in excised canine larynges. Although there is not space 
here to describe these experiments in detail, some of the most significant 
results are summarized below. 


The experimental apparatus is shown schematically in Figure 1. A larynx
was mounted on a pseudo-trachea, which made a right-angle turn just below the
larynx, allowing a window to obtain a subglottal view. A stroboscope
synchronized to subglottal pressure variations was mounted in front of the
preparation. The phase at which the stroboscope was triggered could be
adjusted to any point within the glottal cycle. Airflow was delivered at
regulated flow rate or pressure, and both average pressure and average flow
rate were measured. The subglottal system was intended to simulate the
acoustic properties of the real subglottal tract. The apparatus was mounted
on the top of a rotary indexing table, whose tabletop could be rotated, so
that observations could be made through the microscope at any angle. The
tabletop could also be translated along its two horizontal axes. A
measurement system was devised by which the locations of any points observed
through the microscope could be determined in three dimensions.


With respect to gross aspects of the performance of excised larynges,
observations already made by others were replicated. In addition, it was
observed that, for a given laryngeal configuration, phonation could be
maintained at values of subglottal pressure below those required for
initiating phonation. As the tissues desiccated, the separation between
conditions for onset and conditions for maintenance increased. Thus, mobility
of the surface tissues appeared to be important for initiating phonatory
vibration. Perhaps this observation has some implications for the assessment of
pathologies.


Figure 2 shows data from a run in which the frontal-plane trajectories of
three particles were measured at eighth-cycle increments while the larynx
sustained steady-state vibration. One particle was on the lateral superior
surface of the vocal folds, a second was near the medial superior surface of
the folds, and a third was on the lower (subglottal) surface. These
trajectories are typical. They were roughly elliptical, in the clockwise
direction (for the coordinate system shown). The minor axis of the ellipses
decreased as average distance from the midline increased. Subglottal
particles moved primarily in a horizontal direction, while supraglottal
particles well off the midline moved primarily in a vertical direction.
Trajectories of particles near the midline often exhibited complex
perturbations near the superior-medial parts of their trajectories.
Trajectories of the two upper particles crossed, so that the particles were
nearly vertically aligned during one measurement and horizontally aligned
during another. Thus, the vibrations were complex. Some aspects of the
trajectories and of vibrations in general were consistent with the notion of a
displacement wave, progressing up the medial surface at a velocity of about
1 m/sec, and then progressing laterally on the superior surface at
0.3-0.5 m/sec. The supraglottal wave was easily observed, as with normal human
larynges, and its velocity was measured directly. Glottal closure also
exhibited wavelike properties. Tissues at the lower edge of closure were
peeled apart, while tissues above the point of closure were still coming
together. The depth of closure was often almost negligible immediately before
the glottis opened. The middle particle in Figure 2 appeared to be on the
superior part of the vocal folds for part of the cycle, and was below the
point of closure for part of the closed phase. Thus, it is evident that the
vibrations are complex and cannot be well modeled, in detail, as simple
translations of a small number of lumped-parameter masses.

Figure 2. Frontal-plane trajectories of three particles during a single
glottal cycle. Measurements were made at eighth-cycle increments,
numbered 0 through 7. The inset to the right of the trajectories
contains notes about the measurements, including the angle, θ, of
the tabletop for which each measurement was made. The schematic
sketch at the top of the inset indicates the particle locations
with respect to the margin of the vocal fold.
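
The relation between a displacement-wave velocity and the phase differences it produces can be worked through with simple arithmetic. In the sketch below, the two wave speeds are taken from the ranges reported in the text, while the 3 mm path length and the 100 Hz fundamental frequency are hypothetical values chosen for illustration.

```python
def phase_lag_cycles(distance_m, wave_speed_m_per_s, f0_hz):
    """Phase lag, in cycles, accumulated by a wave travelling `distance_m`
    at `wave_speed_m_per_s` while the source vibrates at `f0_hz`."""
    travel_time_s = distance_m / wave_speed_m_per_s
    return travel_time_s * f0_hz

# Hypothetical path length and f0; speeds follow the reported ranges.
lag_medial = phase_lag_cycles(0.003, 1.0, 100.0)     # up the medial surface
lag_superior = phase_lag_cycles(0.003, 0.4, 100.0)   # laterally on the superior surface
```

Under these assumptions the slower superior-surface wave accumulates a much larger phase lag over the same distance, which is consistent with the wave there being the easier one to observe.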

Although some aspects of the vibration patterns seemed best describable
by surface waves along the cover of the vocal folds, vibrations of the edge
also appeared to be describable as string vibrations (that is, whole-body
transition and torsional flexure). There may have been components of both
types of vibrations. This interpretation is interesting, because interactions
between the two types of vibration as a function of variations in control
parameters may help to explain fine control over voice quality variations.

Detailed shapes of the vocal folds during the eight phase increments in
Figure 2 were estimated and are shown in Figure 3. A two-mass model
approximation could be superimposed on these shapes if vertical movements of
the masses were allowed. Given this approximation, the aerodynamic theory of
Ishizaka and Matsudaira (1972) was capable of reconciling average subglottal
pressure with average flow rate. It was also shown, as expected, that the
aerodynamic model provided for the efficient transfer of energy from the
aerodynamic system to the mechanical system (Stevens, 1977), given the nature
of vertical phase differences. The mechanical parts of the two-mass model did
not well account for these data, however. Thus, to the extent it could be
tested, the aerodynamic aspect of the two-mass model seemed accurate, but the
mechanical part of the model seemed inadequate.


A change in particle trajectories was observed as the tissues desiccated
and vibrations eventually ceased. These and other measurements suggested that
particle trajectories could be considered as oscillations around an unstable
equilibrium position. This result implies that small-signal modeling
techniques, such as those of Ishizaka and Matsudaira (1972), which account for
voice onset by finding unstable solutions to linear equations, are justified.

Excised larynges were able to produce nearly normal vibrations even when
the vocalis muscle on one or both sides was completely removed. However,
these preparations did not seem capable of falsetto vibrations. Wave motions
with velocity similar to that of the normal case were still seen to propagate
upward on the medial wall. Particle trajectories were somewhat similar to the
normal case, although they differed in some details. These observations
should be especially useful for testing models that account for the layered
structure of the vocal folds.






The experiments described above illustrate the potential value of
developing a model specifically for excised larynges, as a step in developing
a model for the in vivo case. The advantages of modeling the excised
preparation explicitly are not only its versatility, as illustrated by the
experiments with excised vocalis muscles, but also the fact that measurements
of mechanical properties can be made on the same preparation on which the
vibration patterns are measured.

Optical techniques for measuring frontal plane vibration patterns, such
as those used by Baer, are limited because they are time consuming and because
only vibrations of the vocal fold surfaces can be measured. Radiographic
techniques may provide a solution to the problem of measuring vocal fold
shapes throughout a cycle. There have been some radiographic studies of vocal
fold vibrations in vivo. Sovak, Courtois, Haas, and Smith (1971) described a
high-speed radiographic study capable of resolving the details of a glottal
cycle. Hollien, Coleman, and Moore (1968) developed the technique of
stroboscopic laminagraphy, in which an x-ray source is pulsed stroboscopically
during a laminagraphic procedure. For steady phonation, images of a frontal
section could thus be obtained at successive phases within a cycle. The
usefulness of these studies was limited by the poor quality of the images
obtained. Furthermore, they may no longer be practical, in view of modern
concerns about radiographic dosage, especially to the thyroid gland. However,
such techniques could be applied safely and more effectively to the study of
excised or animal larynges. A promising improvement on these techniques was
recently described by Saito (1977) and Saito, Fukuda, Ono, and Isogai (1978).
Small lead pellets were affixed to the vocal fold surfaces and also implanted
within the vocal folds, so that both internal and external vibrations could be
monitored. Stroboscopic radiography, synchronized to the voice, was then used
to track the movements of these particles throughout cycles of vibration.
Such measurements might be made even more effectively with a
computer-controlled x-ray microbeam system (Fujimura, Kiritani, & Ishida,
1973; Kiritani, 1977), if its detector output were stroboscopically sampled or
its source stroboscopically pulsed, because of the improved spatial resolution
of this device. Conceivably, a radiopaque medium could be introduced through
the circulatory system, as a further improvement of this technique.



II. MEASUREMENTS IN VIVO: RESPONSES TO INDIVIDUAL CONTROL VARIABLES

There are many parameters controlling phonation in the normal human
larynx. Control is exerted most directly through the effects of the intrinsic
muscles on laryngeal configuration and through transglottal pressure. Forces
exerted by the extrinsic laryngeal muscles and other extrinsic structures also
have an effect. Acoustic load can modify the patterns of airflow through the
glottis and probably the mechanical vibrations as well. There are probably
other effects, such as control of vascular and mucous supply, which are less
well understood. During voluntary control of phonation, variations in several
of these parameters are intercorrelated (see, for example, Atkinson, 1978).
Although such variables as the levels of electromyographic activity in
individual muscles and subglottal pressure can be correlated with
corresponding changes in fundamental frequency or other aspects of phonatory
performance, correlation does not guarantee causality, because of the
intercorrelations among control variables. Therefore, it has been difficult to
isolate the detailed phonatory response to any one of them. Nevertheless, these
detailed effects must be known in order to determine the relevance of data
from excised larynx and animal experiments, to adequately test detailed
phonatory models, and, in general, to fully understand phonatory function.

One method for isolating the effects of a given parameter is to
externally apply involuntary perturbations and observe the phonatory response
while other parameters remain constant. This technique has been most
successfully used for examining the effects of changes in subglottal pressure
on fundamental frequency. Several experiments have been reported in which
subglottal pressure is increased by a sudden push on the chest or abdomen of a
phonating subject, and both subglottal pressure and fundamental frequency are
monitored during an interval for which no muscular response is assumed to
occur (for example, van den Berg, 1957; Isshiki, 1959; Ladefoged, 1963; Ohman
& Lindqvist, 1966; Fromkin & Ohala, 1968). This experiment was recently
replicated by Baer (1979), who also monitored the electromyographic activity
of laryngeal muscles to ensure the absence of a response. Transglottal
pressure can also be varied supraglottally through modulation of intraoral
pressure (Lieberman, Knudson, & Mead, 1969; Hixon, Klatt, & Mead, 1971;
Rothenberg & Mahshie, 1977). When pressure modulations are oscillatory, at
frequencies of about 6-10 Hz, continuous muscular compensation does not seem
to occur, although EMG evidence to support this claim has not been published.



Although results of these induced-pressure-change experiments differ in
some details, their consensus indicates that fundamental frequency varies with
transglottal pressure at rates of about 3-5 Hz/cm H2O within the speech range,
with higher rates at higher fundamental frequencies or in falsetto register.
These results, as well as the correlation between fundamental frequency and
subglottal pressure during voluntary control (Atkinson, 1978), suggest that
the phonatory response to pressure change is fast, perhaps within the interval
of one or two glottal periods.
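
The consensus rate quoted above can be turned into a rough worked
calculation. The following is only a sketch: it assumes a linear relation
between pressure change and fundamental frequency change, takes the rate to be
expressed in Hz per cm H2O, and uses a hypothetical function name and default
bounds (3 and 5) chosen to match the range quoted in the text.

```python
def f0_change_range(delta_pressure_cm_h2o, rate_low=3.0, rate_high=5.0):
    """Bounds (in Hz) on the F0 change expected for a given pressure change.

    Assumes a linear dependence of F0 on transglottal pressure, with a slope
    somewhere between rate_low and rate_high (Hz per cm H2O). These defaults
    are illustrative and apply only within the speech range, not falsetto.
    """
    a = rate_low * delta_pressure_cm_h2o
    b = rate_high * delta_pressure_cm_h2o
    return (min(a, b), max(a, b))


# A chest push that raises transglottal pressure by 2 cm H2O:
low_hz, high_hz = f0_change_range(2.0)
print(low_hz, high_hz)
```

Under these assumptions, a 2 cm H2O increase corresponds to a rise of
roughly 6 to 10 Hz, and a pressure decrease gives a fall of the same size.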


The effects of involuntary perturbations in acoustic load on fundamental
frequency have also been investigated through systematic variation in the
length of a tube that artificially extends the vocal tract (Ishizaka,
Matsudaira, & Takashima, 1968; Ishizaka & Flanagan, 1972). Changes in
fundamental frequency of as much as 20 Hz were obtained by varying the length
of the tube. However, it was not determined in these experiments whether
there was any compensatory laryngeal response. It is easily shown that such
artificially increased acoustic loads can have an effect on phonation. If one
phonates an ascending scale into an artificially extended vocal tract (such as
a mailing tube), the voice will typically break or switch to falsetto when the
fundamental frequency nears the first resonance frequency of the tract. A
lower order manifestation of this phenomenon might account for the intrinsic
pitch of vowels (Peterson & Barney, 1952). In any case, such experiments
could be repeated more carefully to further constrain the performance of
phonatory models.






The logical counterpart to these studies for quantifying the effects of
individual muscles on phonatory performance would probably require electrical
stimulation of the muscles. There are no accounts of any such studies on
normal human subjects, and it is unclear whether stimulation experiments are
possible in practice. However, an alternative method, which isolates the
effects of single-motor-unit contractions, has recently been used by Baer
(1978) for investigating the effects of individual muscles on fundamental
frequency. Rather than analyzing gross aspects of fundamental frequency
control, this method relates very small changes in fundamental frequency
(namely, pitch perturbations) to very small changes in muscle tension, which
can be related to single-motor-unit activity. Statistical independence
between motor-unit inputs can then be exploited to uncorrelate the muscles
and examine their individual causal effects on fundamental frequency.

This method extends the use of an averaging technique that was first
developed for studying properties of single motor units in skeletal muscles
(Milner-Brown, Stein, & Yemm, 1973). Single-motor-unit action potentials (see
Harris, 1981) must be identified in an electromyographic recording while the
muscle sustains a contraction. A simplified muscle model, which is
approximately valid at low to moderate levels of contraction, is assumed. This
model is shown in Figure 4. Its inputs are the action potential trains from
individual motoneurons. Each of these can be considered a random point
process, and they are statistically independent across units. Each motor-unit
action potential triggers a mechanical twitch, a positive pulse of tension
whose detailed characteristics vary across motor units. At least some of
these units fire at low enough rates so that adjacent twitches do not overlap.
The output tension of the whole muscle is the summation of its constituent
motor unit outputs. Although many of the motor unit outputs are trains of
pulses, they sum to an approximately constant, though noisy, value because
they are statistically independent. The relative amplitude of this noise
depends on the number of motor units and their firing rates.

Given the model in Figure 4, the contribution of a single motor unit to
the output tension (its contraction properties) can be estimated if its input
action potentials can be identified and if these inputs are isolated by
intervals great enough to ensure against overlap of adjacent contractions.
Samples of the output tension waveform following the inputs are aligned and
averaged. The output of the isolated motor unit is always the same within
these intervals, while the outputs of all other motor units are random and
thus average to a constant value.
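
The align-and-average step can be sketched in a few lines. This is an
illustrative sketch only, not the procedure actually used in the studies
cited: the function names, the sample-index representation of firing times,
and the isolation rule (a minimum gap to both neighbouring firings) are
assumptions made for the example.

```python
def isolated_firings(firing_indices, min_gap):
    """Keep only firings at least min_gap samples away from both neighbours."""
    ordered = sorted(firing_indices)
    keep = []
    for i, t in enumerate(ordered):
        far_from_prev = i == 0 or t - ordered[i - 1] >= min_gap
        far_from_next = i + 1 == len(ordered) or ordered[i + 1] - t >= min_gap
        if far_from_prev and far_from_next:
            keep.append(t)
    return keep


def triggered_average(signal, trigger_indices, pre, post):
    """Average the windows signal[t - pre : t + post + 1] over all triggers.

    Triggers whose window would run off either end of the signal are skipped.
    Returns the averaged window and the number of windows actually used.
    """
    length = pre + post + 1
    total = [0.0] * length
    used = 0
    for t in trigger_indices:
        start, stop = t - pre, t + post + 1
        if start < 0 or stop > len(signal):
            continue
        for k in range(length):
            total[k] += signal[start + k]
        used += 1
    if used == 0:
        raise ValueError("no trigger leaves a complete window")
    return [value / used for value in total], used


# Toy tension record: constant background plus a small twitch after each firing.
tension = [5.0] * 40
for firing in (10, 20, 30):
    for k, value in enumerate([1.0, 2.0, 1.0]):
        tension[firing + k] += value

average, n_used = triggered_average(tension, [10, 20, 30], pre=2, post=4)
print(n_used, average)
```

In this toy record the background is exactly constant, so the average
reproduces the twitch on top of the baseline. With a noisy background, the
contributions that are uncorrelated with the trigger shrink as more firings
are averaged, which is the property the method relies on.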






To apply this technique to the investigation of fundamental frequency
control, we note that motor-unit firings are statistically independent across
muscles as well as within a muscle. We then hypothesize that muscle-tension
variability contributes to the fundamental frequency perturbations that can be
measured when a normal phonating subject attempts to sustain a steady tone.
The resulting model for pitch perturbations is indicated in Figure 5.
Laryngeal muscles produce roughly constant output tensions that are noisy
because of single-unit effects. The noise components across muscles are
uncorrelated. The complex effect of muscle forces on the vocal folds, which
we have lumped under the term "vocal fold tension," is also roughly constant,
but noisy. Output fundamental frequency then depends on this tension and
other independent inputs such as subglottal pressure and, perhaps, mucosity
and other random effects. All the detailed inputs to this model are thus
statistically independent. According to the model, then, fundamental
frequency as a function of time can be treated as an output and be averaged,
just as muscle tension was in earlier studies, to estimate the effects of
single-motor-unit contractions in that muscle. The effects of other muscles
and other inputs average to a constant value.

SIMPLIFIED MUSCLE MODEL

Figure 4. Simplified model of a muscle during a sustained contraction.

MODEL FOR PITCH PERTURBATIONS

Figure 5. Model for pitch perturbations during production of a steady tone.

To obtain data for such a study, a subject is asked to sustain a steady
tone for several breaths. Electromyographic (EMG) activity, obtained through
hooked-wire electrodes from a laryngeal muscle under study, and the voice
signal, obtained through a standard microphone, are recorded and input to a
digital computer. After instantaneous fundamental frequency as a function of
time is derived, this waveform is offset by approximately its average value
and amplified to exaggerate the perturbations. Isolated single-motor-unit
firings are identified in the EMG waveform. Then, samples of the EMG waveform
and the F0 perturbation waveform are aligned around the single firings and
averaged. The sample window extends from 100 ms before to 300 ms after these
firings.
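
The preprocessing steps in this paragraph can be illustrated as follows.
This is a sketch under stated assumptions: the text does not say how
instantaneous fundamental frequency was derived, so the example simply takes
the reciprocal of each glottal period from a list of cycle onset times. The
function names, the gain value, and the 200 Hz track rate in the usage lines
are all hypothetical.

```python
def instantaneous_f0(cycle_onsets_s):
    """One F0 value (Hz) per glottal cycle, as the reciprocal of its period."""
    return [1.0 / (later - earlier)
            for earlier, later in zip(cycle_onsets_s, cycle_onsets_s[1:])]


def perturbation(f0_values, gain=1.0):
    """Remove the mean F0 and scale what is left to exaggerate the variation."""
    mean_f0 = sum(f0_values) / len(f0_values)
    return [gain * (value - mean_f0) for value in f0_values]


def window_samples(track_rate_hz, pre_ms=100.0, post_ms=300.0):
    """Number of samples before and after a firing for the averaging window."""
    pre = int(round(track_rate_hz * pre_ms / 1000.0))
    post = int(round(track_rate_hz * post_ms / 1000.0))
    return pre, post


onsets = [0.0, 0.010, 0.020, 0.0325]       # seconds; hypothetical cycle marks
f0_track = instantaneous_f0(onsets)         # about 100, 100, and 80 Hz
print(perturbation([99.0, 100.0, 101.0], gain=10.0))
print(window_samples(200.0))                # F0 track resampled at 200 Hz
```

The window sizes returned by the last function would be passed to an
align-and-average routine operating on the perturbation waveform.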

Figure 6 shows a 1.5 s sample of data when the muscle under study was the
cricothyroid, whose function as a vocal-fold tenser and hence as a pitch
raiser is well known. Fundamental frequency was held at about 100 Hz, in the
lower part of the subject's range, in order to keep the number of recruited
units and their firing rates low. As this figure shows, fundamental frequency
was estimated to 1 Hz resolution. Although cycle-to-cycle variations rarely
exceed 1 Hz, perturbations over larger time intervals were about 4 Hz wide.
Two firings have been isolated in this record, and the corresponding sample
intervals are indicated by horizontal lines.



Figure 7 shows the results of the averaging calculation for this
experiment after 19 suitable firings were identified. The upper panel shows
the averaged EMG signal, which exhibits a pulse only at the lineup point, as
expected. The lower panel shows the average F0 perturbation. This signal is
approximately at baseline both to the left of the lineup point and to the far
right of the window. However, there is a positive pulse beginning immediately
after the lineup point. This pulse reaches its peak amplitude of 1 Hz at a
latency of about 70-80 ms. The pulse appears to indicate that the
single-motor-unit contraction caused, on the average, a 1 Hz increase in
fundamental frequency.



A similar calculation was performed for one of the strap muscles, an
extrinsic laryngeal muscle whose possible function in lowering F0 has been a
source of some controversy. When fundamental frequency was in the middle of
the subject's range, no systematic effect was found. Results when the
fundamental frequency was low are shown in Figure 8. Although these data are
somewhat noisier than those in Figure 7, they appear to exhibit a negative
pulse in the interval immediately after the lineup point. Thus, the strap
muscle is shown to have a causal effect in lowering fundamental frequency from
an already low level.



The finding of a muscular contribution to F0 perturbations is itself
interesting, since perturbations have been used as an indicator of vocal
pathology. These results show that care must be taken when interpreting
patterns of perturbation. More relevant to this discussion, however, is the
fact that we can show the response to a short duration pulse of tension in a
single muscle, and that these data can thus be used to constrain the
performance of laryngeal models. It was noted that the average pitch
perturbation for the cricothyroid muscle begins immediately after the lineup
point. This shows that the phonatory response must begin within one glottal
cycle. The latency of the peak of the response, 70-80 ms, includes
contributions due to muscle contraction time, mechanical response latency in
the larynx, and latency of phonatory response. Since both the latency and the
amplitude of the mechanical motor-unit contractions can be estimated in animal
experiments, these data might be further applied to the detailed testing of
models of laryngeal performance, especially in comparison with data reported
by Hirano (1975) relating changes in shape and mechanical properties of vocal
folds to stimulation of various muscles. These data might also shed some
further light on the pattern of motor control. For example, the relatively
large amplitude of the F0 perturbation pulse in Figure 7 relative to the
overall perturbation in Figure 6 suggests that very few motor units were
firing at rates low enough to show the effects of individual twitches.
However, it is unclear how many other units may have been in tetanus. Perhaps
the greatest value of the single-unit technique will be in elucidating the
phonatory function of muscles such as the vocalis, whose gross patterns of
activity are so intercorrelated with those of other muscles during ongoing
regulation of phonation that their detailed effects have remained obscure.

Figure 6. Short segment of data during production of a steady tone at about
100 Hz. Top: voice waveform; Middle: EMG activity of the
cricothyroid muscle; Bottom: "instantaneous fundamental frequency"
extracted from the voice waveform. Two sets of horizontal lines
indicate intervals from 100 ms before to 300 ms after
single-motor-unit firings in the cricothyroid muscle.

CRICOTHYROID MUSCLE: RAW EMG ALIGNED AT SINGLE FIRINGS AND AVERAGED

Figure 7. Ensemble-average waveforms of EMG activity from the cricothyroid
muscle and corresponding instantaneous fundamental frequency. All
waveforms have been aligned at the time of a single-motor-unit
firing for purposes of averaging.

Figure 8. Ensemble-average waveforms of EMG activity from an unspecified
strap muscle and corresponding instantaneous fundamental frequency.
All waveforms have been aligned at the time of a single-motor-unit
firing for purposes of averaging.



In considering the function of individual control parameters in this
section, we have only discussed measurements of their effects on fundamental
frequency. The reason for this is that, with few exceptions, these are the
only measurements that have been made. Fundamental frequency by itself,
however, is evidently not a very complete descriptor of phonatory activity.
As fundamental frequency is varied, attributes of the vocal source waveform
that contribute to intensity and voice quality also vary. It is important to
determine how these parameters covary when changes are produced by different
control mechanisms, and, for purposes of assessing vocal pathology, how these
relationships change in different pathological states.



Techniques to be discussed in today's session can be used to measure some
of these different parameters of phonatory performance, such as amplitude of
the glottal pulse and open quotient. When these parameters are measured
cycle-to-cycle, the same techniques described in this section for studying
fundamental frequency control can be utilized to assess the effects of
different control parameters. These data, together with such anatomical and
physical studies as those reported by Hirano (1975), are needed to improve our
understanding of the phonatory mechanism and constrain the performance of
mechanistic models. Thus, these studies should be pursued. Furthermore, if
it were possible, it would be even more useful to study not only changes in
vibratory performance characteristics as a function of these control
parameters, but also intermediate variables such as the positions of the
laryngeal structures and their mechanical properties. However, these
experiments must await the development of techniques for measuring these
parameters.



Finally, further insights are needed into the detailed conditions
necessary for initiating and sustaining phonation, as well as for regulating
ongoing phonation. One way such studies might be performed in vivo is by
using involuntary perturbations of subglottal pressure. For example, a
subject might be asked to assume a configuration appropriate for voicing but
to maintain subglottal pressure at a level below the threshold for voice
onset. Transglottal pressure might then be suddenly increased, say using a
chest push procedure, to a level for which phonatory vibrations are initiated,
while laryngeal configuration remains constant. Conditions for voice onset
could then be determined, in terms of the level of subglottal pressure as a
function of variations in the configuration. With negative transglottal
pressure perturbations, conditions for voice offset could also be studied.

PHONETIC PERCEPTION OF SINUSOIDAL SIGNALS: EFFECTS OF AMPLITUDE VARIATION*



Robert E. Remez,+ Philip E. Rubin, and Thomas D. Carrell++ 






Abstract. Naive subjects, when instructed to listen for a sentence,
are capable of transcribing the phonetic message of acoustic signals
consisting solely of time-varying sinusoids. These unnatural-sounding
signals mimic the pattern of formant center-frequency and
amplitude variation over the course of polysyllabic, semantically
normal utterances. To what extent does amplitude variation over
time contribute to intelligibility? Our present investigation
tested the hypothesis that listeners derive some information about
syllable patterns from amplitude variation alone, and may therefore
use contextual constraints to deduce prosodically appropriate
portions of the message in the tonal stimulus. Phonetic and
syllabic intelligibility were compared in four conditions: (1)
normal amplitude and frequency variation; (2) normal frequency
variation with constant amplitude; (3) normal frequency variation
with a misleading amplitude contour; and (4) normal amplitude
variation with no frequency variation. These results are discussed
in the framework of phonetic perception and in terms of current
theories of the perception of fluent speech.



Talkers make sounds for listeners to hear. This truism has implicitly
motivated many present explanations of speech perception. Essentially, these
explanations have sought to enumerate the perceptually critical acoustic
elements produced by talkers when generating phonetic sequences. Researchers
have used the ability to synthesize speech to fashion acoustic signals
containing only those acoustic components of natural utterances believed to be
necessary for perception. In doing so, we have made highly refined and
specific descriptions of the stimuli that elicit phonetic perception. In
complementary research, studies of the auditory periphery, of the basilar
membrane, cochlear nucleus and auditory projection have permitted us to learn
how the critical acoustic elements survive auditory transmission. But,
regardless of the differences among the many approaches to studying phonetic
perception, all approaches have assumed that the stimuli for phonetic
perception consist necessarily of the kinds of sounds produced by a variably
excitable, variably shapable tube-resonator: the vocal tract.1

A recent demonstration of ours questioned the assumption that the
perceiver requires phonetic stimuli to comprise, however selectively, acoustic
elements found in natural utterances (Remez, Rubin, Pisoni, & Carrell, 1981).
In raising this question, our study also challenged the assumption that
phonetic perception is based simply on a succession of discrete acoustic
elements. In this study, we used a signal consisting of three time-varying
sinusoids, each of which varied in a way that a formant peak might vary over
the course of an utterance. Initially we fabricated the sinusoidal pattern by
computing the resonant center-frequencies of a natural utterance, using Linear
Predictive Coding (see Figure 1). The table of values produced through this
analysis was used to set frequency and amplitude parameters of a sine-wave
synthesizer. Figure 2 shows the differing short-time Fourier spectra of
natural, synthetic (OVE and Haskins Pattern Playback), and sine-wave signals.
Note the absence of a fundamental frequency, harmonic spectrum, and broadband
formants in the sinewave signal. Lacking these acoustic attributes, the
sinewave spectrum does not resemble the spectrum of a natural signal, in any
literal sense. However, there is energy, albeit infinitely narrowband, at the
computed peaks throughout the duration of the pattern; and, the time-varying
properties of the sinewave pattern, specifically the coherence of the changes
of the energy peaks over time, replicate the natural case.
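
The synthesis step described here, in which a table of frequency and
amplitude values drives a sine-wave synthesizer, can be sketched as follows.
This is a minimal illustration, not the synthesizer used in the study: the
function name, the frame duration, the sample rate, the piecewise-constant
treatment of each frame, and the numerical values in the usage lines are all
assumptions made for the example.

```python
import math


def synthesize_sinewaves(freq_tracks, amp_tracks, frame_dur_s, sample_rate_hz):
    """Sum of time-varying sinusoids, one per (frequency, amplitude) track.

    Each track holds one value per analysis frame. Values are held constant
    within a frame (an assumption), and phase is accumulated from sample to
    sample so that frequency changes do not introduce waveform jumps.
    All tracks are expected to have the same number of frames.
    """
    n_frames = len(freq_tracks[0])
    samples_per_frame = int(round(frame_dur_s * sample_rate_hz))
    out = [0.0] * (n_frames * samples_per_frame)
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0
        i = 0
        for freq, amp in zip(freqs, amps):
            step = 2.0 * math.pi * freq / sample_rate_hz
            for _ in range(samples_per_frame):
                out[i] += amp * math.sin(phase)
                phase += step
                i += 1
    return out


# Three tones standing in for the first three formants (made-up values).
freq_tracks = [[500.0, 550.0, 600.0],
               [1500.0, 1450.0, 1400.0],
               [2500.0, 2500.0, 2450.0]]
amp_tracks = [[0.5, 0.5, 0.4],
              [0.3, 0.3, 0.2],
              [0.2, 0.1, 0.1]]
wave = synthesize_sinewaves(freq_tracks, amp_tracks, 0.01, 8000)
print(len(wave))
```

Because each tone is a single sinusoid, the result has no fundamental
frequency, no harmonic structure, and no formant bandwidth, consistent with
the description above.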


The perceptual effects of sinewave stimuli were easy to predict. Because
the short-time spectra of three-tone signals differ drastically from natural
and even synthetic speech; because no talker is capable of producing three
simultaneous "whistles" with these bandwidths, in this frequency range; and
because the frequency and amplitude variation of the three tones is not
synchronized, the perceiver should hear three independent streams, one for
each sinusoid. The perceiver should hear no phonetic qualities.

However straightforward this prediction seems, there was a second,
contrasting prediction. Suppose that the listener is able to disregard the
short-time differences between sinusoidal signals and speech, and can attend,
instead, to the overall pattern of change of the three tones. The pattern of
change of the frequency peaks resembles the resonance changes produced by a
vocal tract articulating speech. If the listener can apprehend this coherence
in the time-varying properties of the nonspeech signal, then he should hear a
phonetic message spoken by an impossible voice.

Given nonspeech stimuli whose time-varying properties are abstractly
vocal, listeners perceived the signals in both of the ways we predicted.
Those listeners who were told nothing about the stimuli heard science fiction
sounds, bad electronic music, sirens, computer bleeps and radio interference.2
Those listeners who instead were instructed to transcribe a "strangely
synthesized English sentence" did exactly that, for the most part: they
identified the radically unnatural "voice" quality of the patterns, but they
transcribed those patterns as they would have the original natural utterances
upon which we based our sinewave stimuli.

Figure 1. Sinewave stimuli are produced by imitating the time-varying
properties of the center frequency and amplitude of the first three
formants in a natural utterance.

Figure 2. A comparison of the Fourier spectrum of four complex waveforms:
(A) natural speech; (B) synthetic speech produced by the OVE
synthesizer; (C) synthetic speech produced by the Haskins Labs
Pattern Playback; (D) waveform consisting of three sinusoids.

This finding was novel in at least two ways. (1) It extended research on
phonetic perception of sinusoidal signals to a high uncertainty judgment task,
by offering unrestricted response alternatives. Previous tests of sinusoidal
patterns had used forced-choice identification tasks with small response sets
(Bailey, Summerfield, & Dorman, 1977; Best, Morrongiello, & Robson, 1981;
Cutting, 1974; Fant, 1959; Grunke & Pisoni, 1979). Subjects' performance is
obviously stabilized in such circumstances. However, we showed that the
intelligibility of sinusoids does not depend on extensive training with
simple, schematic stimuli, nor on test procedures that intrinsically promote
consistent performance.

(2) More generally, the study indicated that speech perception is
possible despite drastic departures from the short-time spectra of natural
speech (despite absence of broadband formants, harmonic spectrum, and
fundamental frequency), insofar as the time-varying properties of speech
signals are preserved; and, insofar as the listener is able to attend to the
coherent time-variation of the acoustic pattern. Both of these general
qualifications must obtain for phonetic perception of sinusoids to occur, for
the listeners who were not directed to expect speech for the most part did not
spontaneously hear phonetic sequences in the tones.



The present investigation is directed toward questions that arose from
our initial research with perception of sinusoidal replicas of fluent,
semantically ordinary utterances. Primarily, we noted that the tonal patterns
could well be considered an extreme case of defective acoustic-phonetic
stimuli. If this description were apt, then the perceptual process could be
described more conventionally, in quite different terms. Listeners might
merely have memorized the tune of the tones without any phonetic recognition;
and, after inferring a prosodic schema from the amplitude contour preserved in
the tonal pattern, listeners would then have been free to guess (or, rather,
to hypothesize) a likely phonetic sequence for the utterance using "top-down"
finesse. A number of views of the perception of fluent speech include a
prominent faculty for best-guessing lexical patterns from the prosodic
structure when the phonetic stimulus is defective or ambiguous (e.g., Cutler &
Foss, 1977; Huggins, 1978; Nakatani & Schaffer, 1978). Perhaps the listeners
in our original study relied on such guesswork for transcribing the stimulus,
and did not immediately perceive the message from phonetic structure preserved
in the time-varying tonal pattern. In that case, very little phonetic
perception would have occurred, and our theoretical claim would need to be
moderated.



In the test we report here, each listener was presented with a sinusoidal
pattern replicating the sentence, "Where were you a year ago?" In response,
the listener reported two things: (1) a transcription of the sentence; and
(2) a count of the syllables in the sentence. If phonetic information is
preserved in the coherence of the changing sinusoids, then transcription
performance should be no poorer than syllable counting, which would presumably
be based here on the linguistic structure of the message. If, on the
contrary, only prosodic information in the form of amplitude variation is
readily available to the listener, then syllable counting should be much more
accurate than transcription of the message. In this latter condition,
subjects would be likely to vary in the particular phonetic guesses they make,
given that an infinity of sentences may conform to the same prosodic pattern.

The present test also included a stimulus manipulation to evaluate more
directly the difference between perceiving the phonetic structure and guessing
about it based on amplitude information about prosody. Four conditions were
used. In the first, listeners gave their two responses to a sinusoidal
pattern that preserved both peak-frequency and peak-amplitude change of the
first three formants of the original, natural utterance (see Figure 3). In
the second condition, listeners heard a pattern that preserved the frequency
variation of the first three formant center-frequencies at a constant level of
energy throughout the utterance (see Figure 4). In the third condition, the
sinusoidal pattern preserved the frequency pattern of the first three
formants, but with a grossly misleading amplitude contour containing four
segments of high energy and five segments of low energy, high and low
differing by approximately 20 dB (see Figure 5). The fourth condition employed
a sinusoidal pattern with the original formant amplitude variation but with no
frequency variation (see Figure 6). If the coarse amplitude structure of the
stimuli provides reliable prosodic structure, and if subjects rely on this
source of information about the message, then syllable counting should be
accurate in conditions 1 and 4, and poorer in conditions 2 and 3. In
addition, the accuracy of transcription should follow the accuracy of
counting. If subjects perceive the phonetic sequence based on the time-varying
properties of frequency variation, however, transcription and counting should
be good in all conditions but the fourth, in which there is no frequency
variation.
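The four stimulus conditions can be pictured as operations on the frequency
and amplitude tracks of the three tones. The sketch below is only an
illustration under stated assumptions, not the synthesis procedure actually
used: the track values, the sampling rate, the strict alternation of low and
high segments in condition 3, and the use of each tone's mean frequency in
condition 4 are all hypothetical choices made for the example.

```python
import math

def apply_condition(condition, freq_tracks, amp_tracks):
    """Return (freq_tracks, amp_tracks) altered for stimulus condition 1-4."""
    n = len(freq_tracks[0])
    if condition == 1:
        # Original frequency and amplitude variation.
        return freq_tracks, amp_tracks
    if condition == 2:
        # Original frequency variation at a constant energy level.
        return freq_tracks, [[1.0] * n for _ in amp_tracks]
    if condition == 3:
        # Misleading contour: 5 low and 4 high segments, 20 dB apart.
        # Assumption: segments are equal in length and alternate low/high.
        low = 10 ** (-20 / 20)
        contour = [low if (i * 9 // n) % 2 == 0 else 1.0 for i in range(n)]
        return freq_tracks, [list(contour) for _ in amp_tracks]
    if condition == 4:
        # Original amplitude variation, no frequency variation.
        # Assumption: each tone is held at its mean frequency.
        flat = [[sum(track) / n] * n for track in freq_tracks]
        return flat, amp_tracks
    raise ValueError("condition must be 1, 2, 3, or 4")

def synthesize(freq_tracks, amp_tracks, sample_rate=8000):
    """Sum time-varying sinusoids; one frequency/amplitude value per sample."""
    n = len(freq_tracks[0])
    signal = [0.0] * n
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0
        for i in range(n):
            # Accumulate phase so that frequency changes stay continuous.
            phase += 2.0 * math.pi * freqs[i] / sample_rate
            signal[i] += amps[i] * math.sin(phase)
    return signal

# Hypothetical tracks for three tones (not measured formant values).
N = 900
freq_tracks = [
    [500 + 0.2 * i for i in range(N)],
    [1500 - 0.3 * i for i in range(N)],
    [2500 + 0.1 * i for i in range(N)],
]
amp_tracks = [[0.5 + 0.5 * math.sin(math.pi * i / N) for i in range(N)]
              for _ in range(3)]

stimuli = {c: synthesize(*apply_condition(c, freq_tracks, amp_tracks))
           for c in (1, 2, 3, 4)}
```

Keeping the manipulation separate from the synthesis makes it plain that
conditions 1 through 3 share identical frequency tracks and differ only in
amplitude, while condition 4 is the only one that removes frequency variation.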



Our results are straightforward, as Figure 7 depicts. Transcription was
good in conditions 1 (n=14), 2 (n=13) and 3 (n=12); there was no statistical
effect of the amplitude manipulation in these conditions. This indicates that
subjects were not hindered by defective coarse acoustic structure when fine
acoustic structure was available for phonetic perception. (Condition 4 was
not scored for transcription, for the obvious reason that there was nothing
phonetic to transcribe.) In the syllable counting task, there was an enormous
difference between condition 4 (no frequency variation, appropriate amplitude
variation) and the other three conditions (appropriate frequency variation
with either normal, flat, or misleading amplitude variation). A post hoc
means test confirmed that this effect is highly significant (Scheffé, p<.001).
Subjects were clearly unable to derive syllable information solely from
amplitude variation in this case (cf. O'Malley & Peterson, 1966).



We conclude from these results that sinusoidal signals do not consist of
veridical prosodic information and defective acoustic-phonetic information.
Listeners lacked the ability to follow the syllable structure when only the
amplitude variation of the original transcribable pattern was preserved, yet
they were able to apprehend the phonetic detail even when the energy contour
was grossly inappropriate to the segments within it. It seems that listeners
who transcribed these sinusoidal replicas of speech must have relied on
information about the phonetic sequence available in the frequency variation
alone.



Overall, these studies of sinusoidal signals contribute new knowledge
about phonetic perception that is perhaps counterintuitive. That is, phonetic
perception can be elicited solely by a coherent pattern of acoustic variation
comprising elements that cannot, in principle, be realized vocally. In order
to detect this coherence despite unproducible short-time spectra, listeners
must ultimately rely on even more abstract and more forgiving knowledge of
vocal tracts than has been proposed by Liberman (1979). We venture to say
that phonetic perception may actually be based on attention to the coherent
patterns of change in acoustic energy rather than on attention to the
particular qualities of the successive, discrete acoustic elements that
compose the speech signal. To refine our speculation, we must extend this
technique to a wider phonetic repertoire; to a more varied test of short-time
spectral properties that permit the effect to occur; and to manipulations of
the coherence of change directly.

Figure 3. Display of waveform, energy and frequency change of three-tone
replica of "Where were you a year ago?" Stimulus condition 1 (normal
amplitude).

Figure 4. Stimulus condition 2: variation in the frequency of the three
tones at a constant energy level.

Figure 5. Stimulus condition 3: variation in the frequency of the three
tones with a prosodically misleading amplitude pattern.

Figure 6. Stimulus condition 4: no frequency variation with the prosodically
appropriate amplitude pattern.

REFERENCES

Bailey, P. J., Summerfield, A. Q., & Dorman, M. On the identification of
sine-wave analogues of certain speech sounds. Haskins Laboratories
Status Report on Speech Research, 1977, SR-51/52, 1-25.

Best, C. T., Morrongiello, B., & Robson, R. Perceptual equivalence of
acoustic cues in speech and nonspeech perception. Perception &
Psychophysics, 1981, 29, 191-211.

Cutler, A., & Foss, D. J. On the role of sentence stress in sentence
processing. Language and Speech, 1977, 20, 1-10.

Cutting, J. E. Two left-hemisphere mechanisms in speech perception.
Perception & Psychophysics, 1974, 16, 601-612.

Fant, G. Acoustic analysis and synthesis of speech with applications to
Swedish. Ericsson Technics, 1959, 15, 3-108.

Grunke, M. E., & Pisoni, D. B. Perceptual learning of mirror-image acoustic
patterns. In E. Fischer-Jørgensen, J. Rischel, & N. Thorsen (Eds.),
Proceedings of the Ninth International Congress of Phonetic Sciences
(Vol. 2). Copenhagen: Institute of Phonetics, 1979, 461-467.

Huggins, A. W. F. Speech timing and intelligibility. In J. Requin (Ed.),
Attention and performance VII. Hillsdale, N.J.: Lawrence Erlbaum
Associates, 1978, 279-298.

Liberman, A. M. How abstract must a motor theory of speech perception be?
Revue de Phonétique Appliquée, 1979, 49/50, 41-58.

Nakatani, L. H., & Schaffer, J. A. Hearing "words" without words: Prosodic
cues for word perception. Journal of the Acoustical Society of America,
1978, 63, 234-245.

O'Malley, M. H., & Peterson, G. E. An experimental method for prosodic
analysis. Phonetica, 1966, 15, 1-13.

Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. Speech perception
without traditional speech cues. Science, 1981, 212, 947-950.






FOOTNOTES 



To our knowledge, no one claims that the properties of a talker's
utterances necessary to perception are supplied in the auditory channel,
though such a view cannot be excluded a priori.

A very small number of listeners did recognize some phonetic properties
of the stimuli.

MEMORY FOR ITEM ORDER AND PHONETIC RECODING IN THE BEGINNING READER*

Robert B. Katz,+ Donald Shankweiler,+ and Isabelle Y. Liberman+

Abstract. A defect in immediate memory for item order is often
attributed to poor beginning readers. We have supposed that this
problem may be a manifestation of an underlying deficiency in the
use of phonetic codes. Accordingly, we expected good and poor
readers to differ in their ability to order stimuli that can be
easily recoded as words and stored in phonetic form, but not in
their ability to order nonlinguistic stimuli that do not lend
themselves to phonetic recoding in short-term memory. The purpose
of the present study was to test this hypothesis by examining the
ability of good and poor readers to reconstruct the order of sets of
briefly presented stimuli that varied in the extent to which they
could be distinctively recoded into phonetic form: pictures of
common objects versus nonrepresentational, "doodle" drawings. As
expected, an interaction between reading ability and type of
stimulus item was found, demonstrating the material-specific nature of
poor readers' ordering difficulties. These findings support the
hypothesis that a function of the phonetic representation is to aid
in retention of order information and that poor readers' ordering
difficulties are related to their deficient use of phonetic codes.


Certain commonly occurring memory problems of poor beginning readers have
been regarded as manifestations of an underlying deficiency in the use of
phonetic codes. Several studies have shown that children who are poor readers
tend to make ineffective use of phonetic coding in short-term recall of
linguistic material (Liberman, Shankweiler, Liberman, Fowler, & Fischer, 1977;
Mann, Liberman, & Shankweiler, 1980; Shankweiler, Liberman, Mark, Fowler, &
Fischer, 1979). However, special difficulties with recall and recognition
arise only when the stimulus items are words or other items that can readily
be labeled linguistically and retained phonetically in working memory (Holmes
& McKeever, 1979; Vellutino, Pruzek, Steger, & Meshoulam, 1973; Vellutino,
Steger, & Kandel, 1972). When the stimuli do not lend themselves to phonetic
coding, the performances of good and poor readers cannot be distinguished.
For example, we (Liberman, Mann, Shankweiler, & Werfelman, Note 1) tested
recognition memory with two sets of stimuli that could not be easily labeled:
unfamiliar faces and abstract, nonrepresentational line drawings (Kimura,
1963). It was found that good and poor readers were indistinguishable on
memory for both faces and nonsense drawings.

*To appear in Journal of Experimental Child Psychology.

+Also University of Connecticut.

Acknowledgment. This investigation was supported by NICHD Grant HD-01994 and
BRS Grant RR-05596 to Haskins Laboratories. We are grateful to the principal
and teachers of the Parker Memorial School, Tolland, Connecticut for
allowing us to work with the second-grade classes. We are also grateful to
the children who participated and their parents for their cooperation.
Special thanks are due to Leonard Katz for statistical advice.

[HASKINS LABORATORIES: Status Report on Speech Research SR-66 (1981)]


The question we ask here is whether children's memory for the order of
occurrence of stimulus items would also vary with their phonetic recodability.
Repeatedly, the literature has suggested that poor readers have difficulty in
retaining the order of items in tests of serial recall (Bakker, 1972; Benton,
1975; Corkin, 1974). There are indications, as we noted, that the poor
readers' deficits in item recall may be a manifestation of their deficient
ability to use phonetic codes. We should now ask whether the deficits they
might have in remembering the order of stimuli would also vary with the
phonetic recodability of the items. This is what we would expect in light of
suggestions that one function of phonetic memory codes is to preserve item
order (Baddeley, 1978; Crowder, 1978). Consequently, we would suppose that
the poor reader's difficulty in retaining order information is
material-specific and not a global memory deficit for item order.



To pursue this question experimentally, we needed to discover how poor
readers would fare with order memory for nonlinguistic material. While it is
true that some studies (Corkin, 1974; Noelker & Schumsky, 1973; Stanley,
Kaplan, & Poole, 1975) have reported inferior performance by poor readers in
ordering nonlinguistic stimuli, the interpretation of the findings in each
case is open to some question, either because the items used were readily
labeled or because they were presented for long exposure times. In either
instance, even though the stimuli presented were nonlinguistic, the effect of
the procedure might be to accentuate the differences in performance between
the reader groups by encouraging linguistic recoding on the part of the good
readers, who habitually recode phonetically. Moreover, good and poor readers
have been found to be equivalent in ordering other nonlinguistic items, such
as photographed faces (Holmes & McKeever, 1979). At all events, there has
been no direct test of the hypothesis that the poor readers' problem with
order memory may be linked to a deficiency in the use of phonetic codes. The
present experiment was designed to provide direct evidence for such a link.
By controlling for the ease with which linguistic labels can be given to test
items, we expected to find that differences in the performances of good and
poor readers would depend on the phonetic recodability of the stimulus
material.

The experiment compared good and poor readers' memory for order with two
sets of controlled stimuli: a set of items that are easily labeled (line
drawings of common objects) and a set of items presumed to be very difficult
to label (the nonsense drawings of Kimura, 1963). The latter were
chosen for use in this study because good and poor readers performed equally
well with these stimuli in the test of recognition memory to which we referred
earlier (Liberman et al., Note 1).



In the present procedure, a linear array of five figures is
tachistoscopically presented, after which copies of the five figures are
presented on cards, one figure per card, in random order. Subjects are asked
to rearrange the cards, reconstructing the order in the previous display.
Since poor readers tend not to make full use of phonetic coding in working
memory, we expected them to be less accurate than good readers in ordering the
phonetically recodable pictures of common objects, but not to differ from the
good readers in ordering the nonrecodable doodle drawings. Thus we expected
an interaction between reading ability and stimulus type, attributable to
differences in the degree of reliance on phonetic recoding.



METHOD 

Subjects

Subjects were selected from four second-grade classes in the Tolland,
Connecticut public school system. Candidates for the poor reader group were
selected for screening if they were so designated by their teachers or if they
scored at the 40th percentile or lower on both word recognition subtests of
the Comprehensive Test of Basic Skills (CTBS) (1974), which had been
administered in the seventh month of the first grade. Candidates for the good
reader group either received a superior evaluation from the teachers or ranked
at or above the 80th percentile on both CTBS subtests.



Subjects selected for screening were administered the Slosson
Intelligence Test (Slosson, 1963) and the word identification and the word
attack subtests of the Woodcock Reading Mastery Tests (Woodcock, 1973) in the
fifth and sixth months of the school year. The final good reader group
consisted of those subjects who attained a combined raw score of at least 115
on the two Woodcock subtests, while the poor reader group included subjects
with a combined score of less than 85. Subjects with extreme IQ scores (below
90 or above 135) were ineligible for further testing. In addition, one poor
reader had to be dropped because of prolonged absence and ensuing scheduling
difficulties. By these criteria, 21 good readers (10 females, 11 males) and
21 poor readers (7 females, 14 males) were selected. The good readers had a
mean age of 95.1 months compared to the poor readers' mean age of 97.2 months,
t(40) = 1.7, p = .10. The good readers had a mean IQ of 115.3 while the poor
readers had a mean IQ of 107.4, t(40) = 2.7, p = .012. The mean combined raw
score on the Woodcock was 134.6 for the good readers (range: 118 to 153) and
53.0 for the poor readers (range: 22 to 77).
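The final selection criteria amount to a simple decision rule, which can be
sketched as follows. This is a minimal illustration, not part of the original
procedure: the function name and argument names are hypothetical, and the
order in which the IQ and Woodcock criteria are checked is an assumption that
does not affect the outcome.

```python
def assign_group(woodcock_combined, iq):
    """Classify a screened child from the criteria stated in the text.

    woodcock_combined: combined raw score on the two Woodcock subtests.
    iq: Slosson Intelligence Test score.
    Returns "good", "poor", or None (not eligible for either group).
    """
    # Extreme IQ scores (below 90 or above 135) were ineligible.
    if iq < 90 or iq > 135:
        return None
    # Good readers: combined raw score of at least 115.
    if woodcock_combined >= 115:
        return "good"
    # Poor readers: combined raw score of less than 85.
    if woodcock_combined < 85:
        return "poor"
    # Scores from 85 to 114 fall between the two groups.
    return None

# Degrees of freedom for the two-group t tests: 21 + 21 - 2 subjects.
degrees_of_freedom = 21 + 21 - 2
```

The gap between the two cutoffs leaves a band of intermediate scores
unassigned, which keeps the two reader groups clearly separated on the
Woodcock measure.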

Stimuli and Apparatus

Two sets of 50 drawings comprised the stimuli of this study. The first
set consisted of the 50 nonsense drawings of Kimura (1963), which we designate
"phonetically unrecodable" because they are difficult to label distinctively.
The second set, which we call "phonetically recodable," included 50 line
drawings of common objects. The latter had been shown in earlier pilot
studies to be easily recognized by second graders, each drawing typically
eliciting a single response which was a monosyllabic word. Each stimulus
condition required 20 test trials. Each trial consisted of a tachistoscopic
presentation of a different horizontal array of five stimuli mounted on 2 x 2
inch slides. To generate the required 20 arrays for each condition, 10 arrays
were selected by random drawing without replacement from the set of 50 stimuli
for that condition. Then 10 more arrays were generated by a second drawing
for each stimulus condition. One set of three stimuli not used in the test
trials was prepared to be used as practice trials. A sample array for
each stimulus condition is displayed in Figure 1.

Figure 1. The upper portion of the figure gives a sample stimulus array
consisting of five nonrepresentational line drawings (adapted from
Kimura, 1963) for which ready verbal labels are not available. The
lower portion gives a sample array for the comparison condition in
which the items are easily named common objects (adapted from
Makar, 1969).
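The array-generation scheme described above (two passes of drawing without
replacement, each pass exhausting the 50 stimuli in 10 arrays of five) can be
sketched as follows. The stimulus labels, function names, and seed are
hypothetical, and the sketch does not check that an array from the second
pass differs from every array of the first, which the text implies but which
is extremely unlikely to fail by chance.

```python
import random

def draw_arrays(stimuli, rng, array_size=5):
    """One pass: shuffle the full set, then cut it into consecutive arrays.

    Shuffling and slicing is equivalent to drawing without replacement,
    so every stimulus appears exactly once per pass.
    """
    pool = list(stimuli)
    rng.shuffle(pool)
    return [pool[i:i + array_size] for i in range(0, len(pool), array_size)]

def build_condition(stimuli, passes=2, seed=None):
    """Build the test arrays for one stimulus condition from repeated passes."""
    rng = random.Random(seed)
    arrays = []
    for _ in range(passes):
        arrays.extend(draw_arrays(stimuli, rng))
    return arrays

# Hypothetical labels standing in for the 50 drawings of one condition.
stimuli = [f"drawing_{k:02d}" for k in range(1, 51)]
trials = build_condition(stimuli, seed=1981)
```

With 50 stimuli and arrays of five, two passes yield the 20 test trials, and
each drawing is shown exactly twice across the condition.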
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